Can uncertainty estimates detect drift in data streams, and does the choice of estimation method matter?

Original title: An Empirical Study of Uncertainty Estimation Techniques for Detecting Drift in Data Streams

Authors: Anton Winter, Nicolas Jourdan, Tristan Wirth, Volker Knauthe, Arjan Kuijper

The study addresses the reliability of machine learning models in high-stakes fields like medical diagnosis and autonomous driving, where accuracy is crucial. The authors tackle concept drift: the degradation of model performance over time as the underlying data distribution changes. Rather than relying on scarce labeled data, they investigate whether uncertainty estimates can signal these shifts. Evaluating five uncertainty estimation methods paired with the ADWIN drift detector on seven real-world datasets, they find that while SWAG produces the best-calibrated uncertainties, even basic methods remain competitive at detecting drift. Notably, the choice of uncertainty method has little effect on overall predictive accuracy. The results highlight uncertainty-based drift detection as a promising way to maintain model reliability in label-scarce, real-world streaming applications.
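To make the pipeline concrete, here is a minimal sketch of the general idea: feed a stream of per-sample uncertainty scores (e.g. predictive entropy from any of the compared methods) into ADWIN and react when it flags a change. This assumes the `river` library's `drift.ADWIN` interface and uses synthetic uncertainty values; it is an illustration of the approach, not the authors' exact experimental code.

```python
import numpy as np
from river import drift  # ADWIN implementation from the river library

# Synthetic per-sample uncertainty scores standing in for a real stream:
# low uncertainty in the stable regime, higher uncertainty after a drift.
rng = np.random.default_rng(0)
uncertainty_stream = np.concatenate([
    rng.normal(0.2, 0.05, 1000),  # stable regime
    rng.normal(0.6, 0.05, 1000),  # after concept drift
])

adwin = drift.ADWIN()
for t, u in enumerate(uncertainty_stream):
    adwin.update(float(u))        # feed one uncertainty value at a time
    if adwin.drift_detected:      # ADWIN signals a change in the stream's mean
        print(f"Drift signalled at sample {t}")
```

In this label-free setup, a sustained rise in the model's uncertainty is treated as a proxy for concept drift, which is exactly the substitution the paper evaluates across different uncertainty estimation techniques.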

Original article: https://arxiv.org/abs/2311.13374