Outlier Detection and Explanation Method Based on FOLOF Algorithm
Published in | Entropy (Basel, Switzerland) Vol. 27; no. 6; p. 582 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | Switzerland: MDPI AG, 30.05.2025 |
Subjects | |
Summary: | Outlier mining is an essential part of modern data analytics, concerned with identifying and interpreting anomalous observations. Conventional density-based local outlier detection methods do not preprocess the data, so their performance often degrades on novel or heterogeneous datasets. In addition, these algorithms compute an outlier factor for every sample, which is computationally expensive, especially for large datasets. Building on an examination of existing algorithms, this paper introduces a local outlier detection method named FOLOF (FCM Objective Function-based LOF). The approach first applies the elbow rule to determine the optimal number of clusters in the dataset. The fuzzy c-means (FCM) objective function is then used to prune the dataset and extract a candidate set of outliers. Finally, a weighted local outlier factor detection algorithm computes the degree of anomaly of each sample in the candidate set. For the analysis, the Golden Section method is used to classify the outliers. The underlying cause of each outlier can then be revealed by examining its outlier factor in each dimension, which exposes its anomalous attributes. The approach is validated on artificial datasets, the UCI dataset, and an NBA player dataset to demonstrate its effectiveness. |
---|---|
ISSN: | 1099-4300 |
DOI: | 10.3390/e27060582 |
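
The prune-then-score idea in the summary can be illustrated with a minimal sketch. It is not the paper's implementation: the summary does not state the pruning criterion or the LOF weighting, so the code below assumes that the samples contributing most to the FCM objective form the candidate set, and it scores them with the standard unweighted LOF. The number of clusters `c` is passed in directly where the paper would use the elbow rule. The function names `fcm`, `candidate_set`, and `lof_scores`, the parameter `keep`, and the synthetic data are all hypothetical.

```python
import numpy as np

def fcm(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means: returns centers, memberships U (n x c), squared distances."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)          # each row of memberships sums to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2) + 1e-12
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U, d2

def candidate_set(X, c, keep, m=2.0):
    """Assumed pruning rule: keep the samples with the largest share of the FCM objective."""
    _, U, d2 = fcm(X, c, m=m)
    J_i = ((U ** m) * d2).sum(axis=1)          # per-sample term of J = sum_i sum_j u_ij^m d_ij^2
    return np.argsort(J_i)[-keep:]

def lof_scores(X, candidates, k=5):
    """Standard (unweighted) LOF of the candidate rows; neighbours come from all of X.

    For brevity this computes densities for every row; a cost-saving version would
    restrict the work to the candidates and their neighbours.
    """
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    np.fill_diagonal(D, np.inf)                # a point is not its own neighbour
    knn = np.argsort(D, axis=1)[:, :k]         # indices of the k nearest neighbours
    k_dist = D[np.arange(len(X)), knn[:, -1]]  # distance to the k-th neighbour
    # reach_dist(p, o) = max(k_dist(o), d(p, o)) for each neighbour o of p
    reach = np.maximum(k_dist[knn], np.take_along_axis(D, knn, axis=1))
    lrd = 1.0 / (reach.mean(axis=1) + 1e-12)   # local reachability density
    return lrd[knn[candidates]].mean(axis=1) / lrd[candidates]

# Synthetic data for illustration only: two tight clusters plus one distant point (row 100).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (50, 2)),
               rng.normal(5, 0.3, (50, 2)),
               [[2.5, 10.0]]])
candidates = candidate_set(X, c=2, keep=10)
scores = lof_scores(X, candidates, k=5)
top = candidates[np.argmax(scores)]            # candidate with the highest LOF
```

Only the `keep` candidates receive an outlier factor, which is where a pruning step of this kind would reduce cost on large datasets.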