On the Noise Resilience of Ranking Measures

Bibliographic Details
Published in	Neural Information Processing, Vol. 9948, pp. 47-55
Main Author	Berrar, Daniel
Format	Book Chapter
Language	English
Published	Switzerland: Springer International Publishing AG, 2016
Series	Lecture Notes in Computer Science
Subjects	AUC; Classification; H-measure; Noise; Precision-recall curve; Ranking; Robustness; ROC curve; taKS

More Information
Summary:	Performance measures play a pivotal role in the evaluation and selection of machine learning models for a wide range of applications. Using both synthetic and real-world data sets, we investigated the resilience to noise of various ranking measures. Our experiments revealed that the area under the ROC curve (AUC) and a related measure, the truncated average Kolmogorov-Smirnov statistic (taKS), can reliably discriminate between models with truly different performance under various types and levels of noise. With increasing class skew, however, the H-measure and estimators of the area under the precision-recall curve become preferable measures. Because of its simple graphical interpretation and robustness, the lower trapezoid estimator of the area under the precision-recall curve is recommended for highly imbalanced data sets.
ISBN:	3319466712 9783319466712
ISSN:	0302-9743 1611-3349
DOI:	10.1007/978-3-319-46672-9_6