An Empirical Study on Anomaly Detection Algorithms for Extremely Imbalanced Datasets

Bibliographic Details
Published in Artificial Intelligence Applications and Innovations, Vol. AICT-646, Part I, pp. 85-95
Main Authors Fontes, Gonçalo, Matos, Luís Miguel, Matta, Arthur, Pilastri, André, Cortez, Paulo
Format Book Chapter Conference Proceeding
Language English
Published Cham: Springer International Publishing, 2022
Series IFIP Advances in Information and Communication Technology
Summary: Anomaly detection attempts to identify abnormal events that deviate from normality. Since such events are often rare, data in this domain is usually imbalanced. In this paper, we compare diverse preprocessing methods and state-of-the-art Machine Learning (ML) algorithms that can be adopted within this anomaly detection context. These include two unsupervised learning algorithms, namely Isolation Forests (IF) and deep dense AutoEncoders (AE), and two supervised learning approaches, namely Random Forest and an Automated ML (AutoML) method. Several empirical experiments were conducted using seven extremely imbalanced public domain datasets. Overall, the unsupervised IF and AE methods obtained competitive anomaly detection results, with the added advantage of not requiring labeled data.
ISBN: 9783031083327, 3031083326
ISSN: 1868-4238, 1868-422X
DOI: 10.1007/978-3-031-08333-4_7
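
The summary notes that Isolation Forests can flag anomalies without labeled data. As a rough illustration of that idea only, the following pure-Python sketch builds a small isolation forest and scores points on a synthetic, heavily imbalanced dataset. It is not the authors' implementation or experimental setup: the data, the 1% anomaly rate, the 100 trees, the subsample size of 128, and all function names are assumptions made for this example.

```python
import math
import random

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant


def average_path_length(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Isolate points with random axis-aligned splits until the depth limit is hit."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    feature = rng.randrange(len(points[0]))
    lo = min(p[feature] for p in points)
    hi = max(p[feature] for p in points)
    if lo == hi:  # nothing left to split on this feature
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    return {
        "feature": feature,
        "split": split,
        "left": build_tree([p for p in points if p[feature] < split], depth + 1, max_depth, rng),
        "right": build_tree([p for p in points if p[feature] >= split], depth + 1, max_depth, rng),
    }


def path_length(point, node, depth=0):
    """Depth at which the point lands, plus a correction for unsplit leaves."""
    if "size" in node:
        return depth + average_path_length(node["size"])
    child = "left" if point[node["feature"]] < node["split"] else "right"
    return path_length(point, node[child], depth + 1)


def fit_forest(data, n_trees, sample_size, rng):
    """Grow each tree on a random subsample; no labels are used."""
    max_depth = math.ceil(math.log2(sample_size))
    return [build_tree(rng.sample(data, sample_size), 0, max_depth, rng) for _ in range(n_trees)]


def anomaly_score(point, trees, sample_size):
    """Score in (0, 1]; shorter average paths give scores closer to 1."""
    mean_path = sum(path_length(point, t) for t in trees) / len(trees)
    return 2.0 ** (-mean_path / average_path_length(sample_size))


rng = random.Random(0)
# Synthetic data (assumed): 500 normal points near the origin, 5 far-away anomalies.
normal = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(500)]
anomalies = [
    (rng.choice((-1, 1)) * rng.uniform(5, 8), rng.choice((-1, 1)) * rng.uniform(5, 8))
    for _ in range(5)
]
data = normal + anomalies
labels = [0] * len(normal) + [1] * len(anomalies)  # used only to inspect the scores

trees = fit_forest(data, n_trees=100, sample_size=128, rng=rng)
scores = [anomaly_score(p, trees, 128) for p in data]

mean_normal = sum(s for s, y in zip(scores, labels) if y == 0) / len(normal)
mean_anomaly = sum(s for s, y in zip(scores, labels) if y == 1) / len(anomalies)
print(f"mean score, normal points: {mean_normal:.3f}")
print(f"mean score, anomalies:     {mean_anomaly:.3f}")
```

The labels appear only when summarizing the scores, never during fitting, which mirrors the unsupervised setting the summary describes. In practice a maintained library implementation would be preferable to this sketch.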