An Empirical Study on Anomaly Detection Algorithms for Extremely Imbalanced Datasets
Published in | Artificial Intelligence Applications and Innovations Vol. AICT-646; no. Part I; pp. 85 - 95 |
---|---|
Main Authors | , , , , |
Format | Book Chapter Conference Proceeding |
Language | English |
Published | Cham: Springer International Publishing, 2022 |
Series | IFIP Advances in Information and Communication Technology |
Subjects | |
Summary: | Anomaly detection attempts to identify abnormal events that deviate from normality. Since such events are often rare, data in this domain are usually imbalanced. In this paper, we compare diverse preprocessing methods and state-of-the-art Machine Learning (ML) algorithms that can be adopted in this anomaly detection context. These include two unsupervised learning algorithms, namely Isolation Forests (IF) and deep dense AutoEncoders (AE), and two supervised learning approaches, namely Random Forest and an Automated ML (AutoML) method. Several empirical experiments were conducted using seven extremely imbalanced public domain datasets. Overall, the unsupervised IF and AE methods obtained competitive anomaly detection results and have the added advantage of not requiring labeled data. |
---|---|
ISBN: | 9783031083327 3031083326 |
ISSN: | 1868-4238 1868-422X |
DOI: | 10.1007/978-3-031-08333-4_7 |