Outlier Detection and Explanation Method Based on FOLOF Algorithm

Outlier mining constitutes an essential aspect of modern data analytics, focusing on the identification and interpretation of anomalous observations. Conventional density-based local outlier detection methodologies frequently exhibit limitations due to their inherent lack of data preprocessing capab...

Full description

Saved in:

Bibliographic Details
Published in	Entropy (Basel, Switzerland) Vol. 27; no. 6; p. 582
Main Authors	Bai, Lei, Wang, Jiasheng, Zhou, Yu
Format	Journal Article
Language	English
Published	Switzerland MDPI AG 30.05.2025 MDPI
Subjects	Algorithms Analysis Classification Clustering Data analysis Data points Datasets golden section Knowledge Methods Neighborhoods objective function outlier analysis outlier detection outlier factor Outliers (statistics) Performance degradation prune outlier factor outlier analysis prune outlier detection golden section objective function
Online Access	Get full text

Cover

Loading…

More Information
Summary:	Outlier mining constitutes an essential aspect of modern data analytics, focusing on the identification and interpretation of anomalous observations. Conventional density-based local outlier detection methodologies frequently exhibit limitations due to their inherent lack of data preprocessing capabilities, consequently demonstrating degraded performance when applied to novel or heterogeneous datasets. Moreover, the computation of the outlier factor for each sample in these algorithms results in considerably higher computational cost, especially in the case of large datasets. This paper introduces a local outlier detection method named FOLOF (FCM Objective Function-based LOF) through an examination of existing algorithms. The approach starts by applying the elbow rule to determine the optimal number of clusters in the dataset. Subsequently, the FCM objective function is employed to prune the dataset to extract a candidate set of outliers. Finally, a weighted local outlier factor detection algorithm computes the degree of anomaly for each sample in the candidate set. For the analysis, the Golden Section method was used to classify the outliers. The underlying causes of these outliers can be revealed by exploring the anomalous properties of each outlier data point through the outlier factors of each dimension property. This approach has been validated on artificial datasets, the UCI dataset, and an NBA player dataset to demonstrate its effectiveness.
Bibliography:	ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 content type line 23
ISSN:	1099-4300 1099-4300
DOI:	10.3390/e27060582