Adaptive Density Peaks Clustering Based on K-Nearest Neighbor and Gini Coefficient

Bibliographic Details
Published in IEEE Access Vol. 8; pp. 113900-113917
Main Authors Jiang, Dong; Zang, Wenke; Sun, Rui; Wang, Zehua; Liu, Xiyu
Format Journal Article
Language English
Published Piscataway IEEE 2020
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: Density Peaks Clustering (DPC) is a density-based clustering algorithm that has the advantages of not requiring clustering parameters and of detecting non-spherical clusters. The density peaks algorithm obtains the actual cluster centers by taking the cutoff distance as input and having the user select the cluster centers manually. As a result, the cluster centers are not selected with the whole data set taken into account. This paper proposes a method called G-KNN-DPC that calculates the cutoff distance based on the Gini coefficient and K-nearest neighbors. G-KNN-DPC first finds the optimal cutoff distance with the Gini coefficient and then finds the center points with the K-nearest neighbors. This automatic center selection not only avoids the error of detecting two center points in a single cluster but also effectively addresses the traditional DPC algorithm's inability to handle complex data sets. Compared with DPC, Fuzzy C-Means, K-means, KDPC and DBSCAN, the proposed algorithm produces better clusters on different data sets.
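The quantities the summary refers to can be illustrated with a minimal sketch. It computes the standard DPC local density and separation distance for a given cutoff distance, together with a Gini coefficient over their product. The function names, the cutoff-kernel density, and the rule of picking the candidate cutoff with the smallest Gini coefficient are assumptions made for illustration only. The record does not state the paper's exact criterion, and the K-nearest-neighbor center selection is not shown.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix for the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def dpc_rho_delta(D, dc):
    """Standard DPC quantities for a cutoff distance dc.

    rho[i]   : number of other points closer than dc (cutoff kernel).
    delta[i] : distance to the nearest point of higher density, or the
               largest distance from i if no denser point exists.
    """
    n = D.shape[0]
    rho = (D < dc).sum(axis=1) - 1  # subtract 1 to exclude the point itself
    delta = np.empty(n)
    for i in range(n):
        higher = np.where(rho > rho[i])[0]
        delta[i] = D[i, higher].min() if higher.size else D[i].max()
    return rho, delta

def gini(values):
    """Gini coefficient of a set of non-negative values (0 = perfectly even)."""
    v = np.sort(np.asarray(values, dtype=float))
    n, total = v.size, v.sum()
    if total == 0:
        return 0.0
    ranks = np.arange(1, n + 1)
    return float((2 * ranks - n - 1) @ v / (n * total))

def pick_cutoff(D, candidates):
    """Hypothetical selection rule (an assumption, not the paper's stated one):
    return the candidate dc whose rho * delta values have the smallest Gini."""
    scores = []
    for dc in candidates:
        rho, delta = dpc_rho_delta(D, dc)
        scores.append(gini(rho * delta))
    return candidates[int(np.argmin(scores))]
```

With a distance matrix `D = pairwise_distances(X)`, calling `pick_cutoff(D, candidates)` returns one of the supplied candidate cutoffs. In DPC, points with both large `rho` and large `delta` are the usual candidates for cluster centers.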
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2020.3003057