The Impact of Enhancing the k-Means Algorithm through Genetic Algorithm Optimization on High Dimensional Data Clustering Outcomes

Bibliographic Details
Published in	2024 International Conference on Knowledge Engineering and Communication Systems (ICKECS) Vol. 1; pp. 1 - 5
Main Authors	Raman, Ramakrishnan; Kumar, Vikram; Pillai, Biju G.; Rabadiya, Dhaval; Patre, Smruti; Meenakshi, R.
Format	Conference Proceeding
Language	English
Published	IEEE 18.04.2024
Subjects	Accuracy; Clustering algorithms; Evolutionary Computing; Genetic Algorithm Optimization; High-Dimensional Data Clustering; k-Means Algorithm; Knowledge engineering; Machine learning algorithms; Navigation; Robustness; Sensitivity; Unsupervised Machine Learning

More Information
Summary:	In the realm of unsupervised machine learning, the k-Means algorithm stands as a cornerstone for clustering high-dimensional data. However, its efficiency and accuracy can significantly dwindle as the dimensionality of the dataset increases. This paper introduces an innovative approach that integrates Genetic Algorithm (GA) optimization with the k-Means clustering process, aiming to enhance its performance on high-dimensional datasets. The proposed methodology leverages the evolutionary capabilities of Genetic Algorithms to optimize the initial centroid selection and clustering configuration of k-Means, thus addressing its sensitivity to initial conditions and potential to converge to local optima. We systematically detail the implementation of this hybrid algorithm, incorporating mathematical expressions to elucidate the optimization process. Through extensive experiments on synthetic and real-world datasets, we demonstrate the superior clustering outcomes of the enhanced k-Means algorithm in terms of accuracy, robustness, and computational efficiency. The results are presented via a series of graphs and tables, providing a comparative analysis against traditional clustering algorithms. Our findings indicate that the integration of GA into the k-Means algorithm significantly improves its performance on high-dimensional data, making it a powerful tool for data mining applications where conventional clustering methods fall short. This study not only proposes a novel clustering solution but also contributes to the ongoing discourse on optimizing machine learning algorithms for complex datasets.
DOI:	10.1109/ICKECS61492.2024.10617268
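
Illustrative sketch (not taken from the paper): the Summary describes using a Genetic Algorithm to choose the initial centroids for k-Means, but this record does not give the paper's chromosome encoding, fitness function, or genetic operators. The Python below is one minimal way such a hybrid could look, under these assumptions: a chromosome is a set of k point indices, fitness is the within-cluster sum of squared errors (SSE), the best half of the population survives each generation, and standard Lloyd iterations refine the winner. All function names and parameter values (`ga_init`, `lloyd`, `pop_size`, `mutation_rate`, and so on) are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def lloyd(points, centroids, iters=20):
    """Plain k-Means (Lloyd) refinement starting from the given centroids."""
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: sq_dist(p, centroids[i]))
            groups[nearest].append(p)
        # Move each centroid to its cluster mean; keep it if the cluster is empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids


def ga_init(points, k, pop_size=20, generations=30, mutation_rate=0.2, rng=None):
    """Search for good initial centroids with a small genetic algorithm.

    Assumed design: chromosome = k distinct indices into `points`,
    fitness = SSE of the centroids those indices select (lower is better).
    """
    if rng is None:
        rng = random.Random(0)
    n = len(points)
    population = [rng.sample(range(n), k) for _ in range(pop_size)]

    def fitness(chromosome):
        return sse(points, [points[i] for i in chromosome])

    for _ in range(generations):
        population.sort(key=fitness)
        elite = population[: pop_size // 2]  # elitism: best half survives
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Crossover: draw k distinct genes from the union of both parents.
            pool = list(dict.fromkeys(a + b))
            child = rng.sample(pool, k)
            # Mutation: swap one gene for an index not already in the child.
            if rng.random() < mutation_rate:
                unused = [i for i in range(n) if i not in child]
                if unused:
                    child[rng.randrange(k)] = rng.choice(unused)
            children.append(child)
        population = elite + children

    best = min(population, key=fitness)
    return [points[i] for i in best]
```

A typical call would be `centroids = lloyd(points, ga_init(points, k=3))`, where `points` is a list of equal-length numeric tuples. Because the best half of each generation is carried over unchanged, the best fitness in the population never gets worse from one generation to the next; whether this improves on random or k-means++ initialization for a given dataset is an empirical question this sketch does not answer.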