Tackling Virtual and Real Concept Drifts: An Adaptive Gaussian Mixture Model Approach
Published in | IEEE transactions on knowledge and data engineering Vol. 35; no. 2; pp. 2048 - 2060 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | New York IEEE 01.02.2023 The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Summary: | Real-world applications have been dealing with large amounts of data that arrive over time and generally present changes in their underlying joint probability distribution, i.e., concept drift. Concept drift can be subdivided into two types: virtual drift, which affects the unconditional probability distribution $p(\boldsymbol{x})$, and real drift, which affects the conditional probability distribution $p(y|\boldsymbol{x})$. Existing work focuses on real drift. However, strategies to cope with real drift may not be the best suited for dealing with virtual drift, since the real class boundaries remain unchanged. We provide the first in-depth analysis of the differences between the impact of virtual and real drifts on classifiers' suitability. We propose an approach to handle both drifts called On-line Gaussian Mixture Model With Noise Filter For Handling Virtual and Real Concept Drifts (OGMMF-VRD). Experiments with seven synthetic and seven real-world datasets show that OGMMF-VRD outperforms other approaches with separate mechanisms to deal with virtual and real drifts. It also has more stable rankings and smaller drops in performance during drifting periods than existing ensemble approaches, making it more reliable for adoption in practice. |
---|---|
ISSN: | 1041-4347 1558-2191 |
DOI: | 10.1109/TKDE.2021.3099690 |
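
The distinction drawn in the summary follows from factoring the joint distribution as $p(\boldsymbol{x}, y) = p(\boldsymbol{x})\,p(y|\boldsymbol{x})$: virtual drift changes only the first factor, real drift changes the second. The sketch below illustrates this on a toy two-dimensional stream. It is not OGMMF-VRD and is not taken from the paper; the labelling rule, the helper names (`true_label`, `sample`, `fixed_classifier`), and the chosen means are all assumptions made for illustration, and the classifier is idealised in that it already knows the original boundary exactly.

```python
import numpy as np


def true_label(x, flipped=False):
    """Hypothetical concept: class 1 when x0 + x1 > 0.

    Setting flipped=True inverts the labels, which changes p(y|x)
    and therefore simulates a real drift.
    """
    y = (x[:, 0] + x[:, 1] > 0).astype(int)
    return 1 - y if flipped else y


def sample(rng, n, mean, flipped=False):
    """Draw n points from a unit-variance Gaussian centred at `mean`."""
    x = rng.normal(loc=mean, scale=1.0, size=(n, 2))
    return x, true_label(x, flipped)


def fixed_classifier(x):
    """Stand-in for a model that learned the original boundary exactly."""
    return (x[:, 0] + x[:, 1] > 0).astype(int)


def accuracy(x, y):
    return float(np.mean(fixed_classifier(x) == y))


rng = np.random.default_rng(0)

# Original concept: inputs centred at the origin.
x_before, y_before = sample(rng, 2000, mean=[0.0, 0.0])

# Virtual drift: p(x) moves to a new region, p(y|x) is unchanged.
x_virtual, y_virtual = sample(rng, 2000, mean=[3.0, 3.0])

# Real drift: p(x) is unchanged, p(y|x) changes (labels are inverted).
x_real, y_real = sample(rng, 2000, mean=[0.0, 0.0], flipped=True)

acc_before = accuracy(x_before, y_before)
acc_virtual = accuracy(x_virtual, y_virtual)
acc_real = accuracy(x_real, y_real)

print("before drift :", acc_before)
print("virtual drift:", acc_virtual)
print("real drift   :", acc_real)
```

Because the boundary itself does not move under virtual drift, the idealised classifier stays correct there and fails only under real drift. A model trained on finite data from the old region would not necessarily generalise this well to the new region, which is one reason the two drift types may call for different handling.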