Anomaly Detection: How to Artificially Increase Your F1-Score with a Biased Evaluation Protocol
Anomaly detection is a widely explored domain in machine learning. Many models are proposed in the literature, and compared through different metrics measured on various datasets. The most popular metrics used to compare performances are F1-score, AUC and AVPR. In this paper, we show that F1-score a...
Saved in:
Published in | Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track Vol. 12978; pp. 3 - 18 |
---|---|
Main Authors | , , , |
Format | Book Chapter |
Language | English |
Published |
Switzerland
Springer International Publishing AG
2021
Springer International Publishing |
Series | Lecture Notes in Computer Science |
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | Anomaly detection is a widely explored domain in machine learning. Many models are proposed in the literature, and compared through different metrics measured on various datasets. The most popular metrics used to compare performances are F1-score, AUC and AVPR. In this paper, we show that F1-score and AVPR are highly sensitive to the contamination rate. One consequence is that it is possible to artificially increase their values by modifying the train-test split procedure. This leads to misleading comparisons between algorithms in the literature, especially when the evaluation protocol is not well detailed. Moreover, we show that the F1-score and the AVPR cannot be used to compare performances on different datasets as they do not reflect the intrinsic difficulty of modeling such data. Based on these observations, we claim that F1-score and AVPR should not be used as metrics for anomaly detection. We recommend a generic evaluation procedure for unsupervised anomaly detection, including the use of other metrics such as the AUC, which are more robust to arbitrary choices in the evaluation protocol. |
---|---|
Bibliography: | D. Fourure, M. U. Javaid, N. Posocco and S. Tihon—Equal contribution. |
ISBN: | 9783030865139 3030865134 |
ISSN: | 0302-9743 1611-3349 |
DOI: | 10.1007/978-3-030-86514-6_1 |