Anomaly detection by robust statistics

Bibliographic Details
Published in	Wiley interdisciplinary reviews. Data mining and knowledge discovery Vol. 8; no. 2
Main Authors	Rousseeuw, Peter J.; Hubert, Mia
Format	Journal Article
Language	English
Published	Hoboken, USA: Wiley Periodicals, Inc., 01.03.2018
Summary:	Real data often contain anomalous cases, also known as outliers. These may spoil the resulting analysis but they may also contain valuable information. In either case, the ability to detect such anomalies is essential. A useful tool for this purpose is robust statistics, which aims to detect the outliers by first fitting the majority of the data and then flagging data points that deviate from it. We present an overview of several robust methods and the resulting graphical outlier detection tools. We discuss robust procedures for univariate, low‐dimensional, and high‐dimensional data, such as estimating location and scatter, linear regression, principal component analysis, classification, clustering, and functional data analysis. Also the challenging new topic of cellwise outliers is introduced.
	WIREs Data Mining Knowl Discov 2018, 8:e1236. doi: 10.1002/widm.1236
	This article is categorized under: Algorithmic Development > Spatial and Temporal Data Mining; Technologies > Classification; Technologies > Structure Discovery and Clustering; Technologies > Visualization
	Figure: Fitting a line to data with outliers: classical (red) and robust (blue).
ISSN:	1942-4787, 1942-4795
DOI:	10.1002/widm.1236
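
The summary describes the general recipe of robust outlier detection: fit the majority of the data first, then flag the points that deviate from that fit. As a rough illustration of the univariate case only, the sketch below centers the data by the median and scales by the normalized median absolute deviation (MAD), then compares the result with the classical mean and standard deviation rule. The function names, the sample values, and the cutoff of 2.5 are assumptions chosen for this example; they are not taken from the article.

```python
import statistics


def robust_z_scores(values):
    """Robust z-scores: center by the median, scale by the normalized MAD."""
    med = statistics.median(values)
    abs_dev = [abs(v - med) for v in values]
    # The factor 1.4826 makes the MAD consistent with the standard
    # deviation when the bulk of the data is normally distributed.
    mad = 1.4826 * statistics.median(abs_dev)
    if mad == 0:
        raise ValueError("MAD is zero: more than half of the values coincide")
    return [(v - med) / mad for v in values]


def classical_z_scores(values):
    """Classical z-scores based on the mean and sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def flag(z_scores, cutoff=2.5):
    """Indices whose absolute z-score exceeds the cutoff (assumed: 2.5)."""
    return [i for i, z in enumerate(z_scores) if abs(z) > cutoff]


# Hypothetical measurements; the sixth value has a misplaced decimal point.
data = [6.27, 6.34, 6.25, 6.31, 6.28, 63.1, 6.29]

robust_flags = flag(robust_z_scores(data))
classical_flags = flag(classical_z_scores(data))

print("robust rule flags:   ", robust_flags)     # [5]
print("classical rule flags:", classical_flags)  # []
```

In this small sample the classical rule flags nothing, because the single outlier inflates both the mean and the standard deviation (an effect often called masking), while the median and MAD are barely affected and the outlier stands out clearly.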