Density peaks clustering algorithm based on fuzzy and weighted shared neighbor for uneven density datasets

Bibliographic Details
Published in Pattern Recognition Vol. 139; p. 109406
Main Authors Zhao, Jia; Wang, Gang; Pan, Jeng-Shyang; Fan, Tanghuai; Lee, Ivan
Format Journal Article
Language English
Published Elsevier Ltd 01.07.2023
Summary:
• A new DPC algorithm for uneven density datasets is proposed.
• A new local density calculation method based on fuzzy neighborhood is designed.
• A new allocation strategy based on weighted shared nearest neighbor is proposed.
• The new DPC algorithm has excellent clustering accuracy for different types of datasets.

Uneven density data are data in which sample density differs between clusters. The local density used by the density peaks clustering algorithm (DPC) does not account for this difference, which may lead to the wrong selection of cluster centers. In addition, the DPC allocation strategy tends to assign samples that belong to sparse clusters to dense clusters, which reduces clustering efficiency. In this study, we propose a density peaks clustering algorithm based on fuzzy and weighted shared neighbor for uneven density datasets (DPC-FWSN). First, a nearest neighbor fuzzy kernel function is obtained by combining K-nearest neighbor and fuzzy neighborhood. Then, local density is redefined using this kernel function. By balancing the contribution of sample density in dense and sparse areas, the redefined local density better characterizes the distribution of the samples and avoids the situation in which a sparse cluster has no cluster center. Finally, an allocation strategy based on weighted shared neighbor similarity is proposed to optimize sample allocation at the boundaries of sparse clusters. DPC-FWSN is compared with IDPC-FA, FKNN-DPC, FNDPC, DPCSA and DPC on uneven density datasets, datasets with complex morphologies and real datasets. The clustering results demonstrate that DPC-FWSN effectively handles datasets with uneven density distribution.
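The summary names three ingredients: a local density built from K-nearest neighbors and a fuzzy neighborhood, the usual DPC distance to a denser sample, and a similarity based on shared neighbors. The sketch below illustrates those ideas only in a generic form. The record does not give the paper's formulas, so the kernel `1 - d_ij / r_i`, the unweighted shared-neighbor count, and all function names are assumptions made for illustration and should not be read as the DPC-FWSN definitions.

```python
import numpy as np

def pairwise_dist(X):
    """Euclidean distance matrix for the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def knn_indices(D, k):
    """Indices of the k nearest neighbors of each sample.
    Column 0 of the row-wise argsort is normally the sample itself, so it is skipped."""
    return np.argsort(D, axis=1)[:, 1:k + 1]

def knn_fuzzy_density(D, knn, eps=1e-12):
    """Illustrative local density (assumed kernel, not the paper's):
    each neighbor j contributes a membership 1 - d_ij / r_i, where r_i is the
    distance from i to its k-th neighbor, so the scale adapts to sparse regions."""
    n = D.shape[0]
    d = D[np.arange(n)[:, None], knn]   # (n, k) distances to the neighbors
    r = d[:, -1:] + eps                 # local radius: distance to the k-th neighbor
    return (1.0 - d / r).sum(axis=1)

def dpc_delta(D, rho):
    """Standard DPC relative distance: for each sample, the distance to the
    nearest sample ranked denser. The densest sample gets its largest distance."""
    n = len(rho)
    order = np.argsort(-rho, kind="stable")   # densest first
    delta = np.zeros(n)
    parent = np.full(n, -1)                   # nearest denser sample, -1 for the densest
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = D[i].max()
            continue
        higher = order[:rank]
        j = higher[np.argmin(D[i, higher])]
        delta[i] = D[i, j]
        parent[i] = j
    return delta, parent

def shared_neighbor_count(knn, i, j):
    """Number of K-nearest neighbors that samples i and j have in common
    (plain count; the weighting used in the paper is not reproduced here)."""
    return len(set(knn[i].tolist()) & set(knn[j].tolist()))
```

In a DPC-style pipeline, samples with large `rho * delta` would be candidate cluster centers, and a shared-neighbor similarity such as the count above could guide how the remaining samples are allocated. How DPC-FWSN weights that similarity is described in the full text, not in this record.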
ISSN: 0031-3203, 1873-5142
DOI: 10.1016/j.patcog.2023.109406