Sparsity Fuzzy C-Means Clustering with Principal Component Analysis Embedding

Bibliographic Details
Published in	IEEE transactions on fuzzy systems Vol. 31; no. 7; pp. 1 - 13
Main Authors	Chen, Jingwei; Zhu, Jianyong; Jiang, Hongyun; Yang, Hui; Nie, Feiping
Format	Journal Article
Language	English
Published	New York: IEEE, 01.07.2023; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Algorithms; Clustering; Clustering algorithms; Clustering methods; Data analysis; Data mining; Data points; Dimensionality reduction; Embedding; Feature extraction; Fuzzy c-means (FCM); Iterative methods; Noise sensitivity; Optimization; outliers; Outliers (statistics); Pattern recognition; Principal component analysis; principal component analysis (PCA); Principal components analysis; Robustness; Sparsity
Summary:	Clustering is widely used in data mining, pattern recognition, and image identification. Fuzzy c-means (FCM) is a soft clustering method that introduces the concept of membership. In FCM, the fuzzy membership matrix is obtained by calculating the distance between data points in the original space. However, this approach may yield suboptimal results owing to the influence of redundant features. Moreover, FCM is sensitive to noise points and strongly affected by outliers. In this paper, we propose a method called sparsity FCM clustering with principal component analysis embedding (P_SFCM). We conduct principal component analysis (PCA) and membership learning simultaneously, and add a weighting factor for each data point in order to identify noise and outliers. The benefit of this framework is that it retains most of the information in the subspace while improving robustness to noise. We employ an iterative optimization algorithm to solve the model efficiently. To verify the reliability of the proposed method, we conduct a convergence analysis, a noise robustness analysis, and multi-cluster experiments. Furthermore, comparative experiments are conducted on both synthetic and real benchmark datasets. The experimental results show that P_SFCM is competitive with comparable methods.
ISSN:	1063-6706; 1941-0034
DOI:	10.1109/TFUZZ.2022.3217343
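
For readers unfamiliar with the baseline the summary refers to, the following is a minimal sketch of standard fuzzy c-means run on a PCA-reduced copy of the data. It is not the P_SFCM method of the paper: PCA and clustering are applied one after the other here, whereas the paper learns them jointly and adds a per-point weighting factor. The function names `pca_project` and `fcm`, the fuzzifier `m=2.0`, and the other defaults are illustrative assumptions.

```python
import numpy as np

def pca_project(X, k):
    """Project the rows of X onto the top-k principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def fcm(X, c, m=2.0, n_iter=200, tol=1e-6, seed=0):
    """Standard fuzzy c-means (assumed baseline, not the paper's model).

    Returns the membership matrix U (n x c, rows sum to 1) and the centers.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centers are membership-weighted means of the data points.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared Euclidean distance from every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Two synthetic blobs in 5 dimensions (illustrative data only).
    a = rng.normal(0.0, 0.5, size=(50, 5))
    b = rng.normal(0.0, 0.5, size=(50, 5))
    b[:, 0] += 10.0
    Z = pca_project(np.vstack([a, b]), k=2)
    U, centers = fcm(Z, c=2)
    print(U.argmax(axis=1))
```

Each row of `U` holds one point's soft memberships across the clusters; taking the `argmax` gives a hard assignment when one is needed.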