Sparse Clustering Algorithm Based on Multi-Domain Dimensionality Reduction Autoencoder

Bibliographic Details
Published in Mathematics (Basel) Vol. 12; no. 10; p. 1526
Main Authors Kang, Yu; Liu, Erwei; Zou, Kaichi; Wang, Xiuyun; Zhang, Huaqing
Format Journal Article
Language English
Published Basel MDPI AG 01.05.2024
Summary: The key to high-dimensional clustering lies in discovering the intrinsic structures and patterns in data. However, high-dimensional clustering faces major challenges, including the curse of dimensionality, increased data sparsity, and reduced reliability of the clustering results. To address these issues, we propose a sparse clustering algorithm based on a multi-domain dimensionality reduction model. The method integrates a sparse reconstruction process and L1 regularization into a deep autoencoder. A sparse reconstruction module, based on L1 sparse reconstruction of features in different domains, reconstructs the data. The method makes two main contributions. First, it combines the spatial and frequency domains, taking into account both the spatial distribution and the frequency characteristics of the data, to provide multiple perspectives for data analysis and processing. Second, it constructs a sparse neural network-based clustering model by projecting data points onto multiple domains and applying adaptive regularization penalty terms to the weight matrix. Experimental results demonstrate superior performance of the proposed method on clustering problems with high-dimensional datasets.
ISSN: 2227-7390
DOI: 10.3390/math12101526
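
The ingredients named in the summary (a multi-domain view of the data, an autoencoder reconstruction, and an L1 penalty on the weights) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that the frequency-domain view is an FFT magnitude spectrum, that the encoder is a single tanh layer, and that the penalty weight `lam` is fixed (the summary describes adaptive penalties, which are not modeled here). All function and parameter names are hypothetical.

```python
import numpy as np


def multi_domain_features(X):
    """Concatenate the raw (spatial-domain) features with FFT magnitudes.

    Assumption: the frequency-domain view is the magnitude of the real FFT
    taken along the feature axis of each sample.
    """
    freq = np.abs(np.fft.rfft(X, axis=1))
    return np.concatenate([X, freq], axis=1)


def soft_threshold(W, t):
    """Proximal operator of t * ||W||_1: shrink every entry toward zero by t."""
    return np.sign(W) * np.maximum(np.abs(W) - t, 0.0)


def sparse_autoencoder_loss(X, W_enc, W_dec, lam):
    """Mean squared reconstruction error plus an L1 penalty on encoder weights.

    Assumption: one tanh hidden layer and a fixed penalty weight lam.
    """
    Z = np.tanh(X @ W_enc)      # low-dimensional code
    X_hat = Z @ W_dec           # linear reconstruction
    recon = np.mean((X - X_hat) ** 2)
    return recon + lam * np.sum(np.abs(W_enc))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(8, 16))            # 8 samples, 16 features
    F = multi_domain_features(X)            # spatial + frequency view
    W_enc = rng.normal(scale=0.1, size=(F.shape[1], 4))
    W_dec = rng.normal(scale=0.1, size=(4, F.shape[1]))
    print(F.shape, sparse_autoencoder_loss(F, W_enc, W_dec, lam=1e-3))
    # One proximal step that zeroes out small encoder weights.
    W_enc = soft_threshold(W_enc, 0.05)
    print("zeroed weights:", int(np.sum(W_enc == 0.0)))
```

In a full pipeline the low-dimensional codes `Z` would typically be passed to a clustering step such as k-means. That step, like the choice of soft-thresholding to enforce sparsity, is an assumption of this sketch and is not taken from the record.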