Tackling Virtual and Real Concept Drifts: An Adaptive Gaussian Mixture Model Approach

Bibliographic Details
Published in: IEEE Transactions on Knowledge and Data Engineering, Vol. 35, no. 2, pp. 2048-2060
Main Authors: Oliveira, Gustavo H. F. M.; Minku, Leandro L.; Oliveira, Adriano L. I.
Format: Journal Article
Language: English
Published: New York, IEEE, 01.02.2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: Real-world applications have been dealing with large amounts of data that arrive over time and generally present changes in their underlying joint probability distribution, i.e., concept drift. Concept drift can be subdivided into two types: virtual drift, which affects the unconditional probability distribution $p(\boldsymbol{x})$, and real drift, which affects the conditional probability distribution $p(y|\boldsymbol{x})$. Existing work focuses on real drift. However, strategies to cope with real drift may not be the best suited for dealing with virtual drift, since the real class boundaries remain unchanged. We provide the first in-depth analysis of the differences between the impact of virtual and real drifts on classifiers' suitability. We propose an approach to handle both drifts called On-line Gaussian Mixture Model With Noise Filter For Handling Virtual and Real Concept Drifts (OGMMF-VRD). Experiments with seven synthetic and seven real-world datasets show that OGMMF-VRD outperforms other approaches with separate mechanisms to deal with virtual and real drifts. It also has more stable rankings and smaller drops in performance during drifting periods than existing ensemble approaches, making it more reliable for adoption in practice.
ISSN: 1041-4347, 1558-2191
DOI: 10.1109/TKDE.2021.3099690
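
The distinction the summary draws between virtual drift (a change in $p(\boldsymbol{x})$) and real drift (a change in $p(y|\boldsymbol{x})$) can be illustrated with a toy one-dimensional stream. The sketch below is not OGMMF-VRD and is not taken from the article. The names `true_label`, `make_stream`, and `accuracy`, the Gaussian inputs, and the threshold classifier are all assumptions chosen for illustration.

```python
import random


def true_label(x, flipped=False):
    """Toy concept (assumed): class 1 when x > 0; inverted when flipped."""
    return int((x > 0) != flipped)


def make_stream(n, mean, flipped, rng):
    """Draw n (x, y) pairs with x ~ Normal(mean, 1) and y from the concept."""
    xs = [rng.gauss(mean, 1.0) for _ in range(n)]
    return [(x, true_label(x, flipped)) for x in xs]


def accuracy(threshold, stream):
    """Accuracy of the fixed rule 'predict class 1 when x > threshold'."""
    hits = sum(int(x > threshold) == y for x, y in stream)
    return hits / len(stream)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Reference period: inputs centred at 0, boundary at x = 0.
before = make_stream(2000, mean=0.0, flipped=False, rng=rng)
# Virtual drift: p(x) shifts to a new mean, p(y|x) is untouched.
virtual = make_stream(2000, mean=2.0, flipped=False, rng=rng)
# Real drift: p(x) is the same as before, but p(y|x) is inverted.
real = make_stream(2000, mean=0.0, flipped=True, rng=rng)

# A classifier that already sits on the true boundary.
acc_before = accuracy(0.0, before)
acc_virtual = accuracy(0.0, virtual)
acc_real = accuracy(0.0, real)

# A classifier with a misplaced boundary (threshold 1.0 is hypothetical).
off_before = accuracy(1.0, before)
off_virtual = accuracy(1.0, virtual)

print(f"true boundary:  before={acc_before:.2f} "
      f"virtual={acc_virtual:.2f} real={acc_real:.2f}")
print(f"off boundary:   before={off_before:.2f} virtual={off_virtual:.2f}")
```

In this toy setting, the classifier on the true boundary is unaffected by virtual drift and fails completely under real drift, which matches the summary's point that the real class boundaries do not move under virtual drift. The misplaced threshold scores differently once the inputs shift, even though the concept itself never changed. This hints at why virtual drift can still affect how suitable a given classifier is.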