The Impact of Enhancing the k-Means Algorithm through Genetic Algorithm Optimization on High Dimensional Data Clustering Outcomes

Bibliographic Details
Published in: 2024 International Conference on Knowledge Engineering and Communication Systems (ICKECS), Vol. 1, pp. 1-5
Main Authors: Raman, Ramakrishnan; Kumar, Vikram; Pillai, Biju G.; Rabadiya, Dhaval; Patre, Smruti; Meenakshi, R.
Format: Conference Proceeding
Language: English
Published: IEEE, 18.04.2024
Summary: In the realm of unsupervised machine learning, the k-Means algorithm stands as a cornerstone for clustering high-dimensional data. However, its efficiency and accuracy can significantly dwindle as the dimensionality of the dataset increases. This paper introduces an innovative approach that integrates Genetic Algorithm (GA) optimization with the k-Means clustering process, aiming to enhance its performance on high-dimensional datasets. The proposed methodology leverages the evolutionary capabilities of Genetic Algorithms to optimize the initial centroid selection and clustering configuration of k-Means, thus addressing its sensitivity to initial conditions and potential to converge to local optima. We systematically detail the implementation of this hybrid algorithm, incorporating mathematical expressions to elucidate the optimization process. Through extensive experiments on synthetic and real-world datasets, we demonstrate the superior clustering outcomes of the enhanced k-Means algorithm in terms of accuracy, robustness, and computational efficiency. The results are presented via a series of graphs and tables, providing a comparative analysis against traditional clustering algorithms. Our findings indicate that the integration of GA into the k-Means algorithm significantly improves its performance on high-dimensional data, making it a powerful tool for data mining applications where conventional clustering methods fall short. This study not only proposes a novel clustering solution but also contributes to the ongoing discourse on optimizing machine learning algorithms for complex datasets.
DOI: 10.1109/ICKECS61492.2024.10617268
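
Illustration: The summary describes using a Genetic Algorithm to choose the initial centroids for k-Means. The record does not give the paper's operators, fitness function, or parameters, so the following is only a minimal sketch under assumptions: each chromosome is a set of k centroids, fitness is the within-cluster sum of squared errors (lower is better), and the GA uses elitism, selection from the better half of the population, per-centroid uniform crossover, and Gaussian mutation. The names `sse`, `lloyd`, `ga_seed` and all default values are hypothetical and are not taken from the paper.

```python
import numpy as np

def sse(X, centroids):
    # Sum of squared distances from each point to its nearest centroid.
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()

def lloyd(X, centroids, iters=50):
    # Standard k-Means refinement (Lloyd iterations) from given centroids.
    centroids = np.array(centroids, dtype=float)
    for _ in range(iters):
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(len(centroids)):
            members = X[labels == j]
            if len(members):  # an empty cluster keeps its old centroid
                centroids[j] = members.mean(axis=0)
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, d2.argmin(axis=1)

def ga_seed(X, k, pop_size=20, generations=30, mutation_scale=0.1, seed=0):
    # Assumed GA: evolves candidate centroid sets to minimize SSE.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Each chromosome starts as k distinct data points.
    pop = np.stack([X[rng.choice(n, size=k, replace=False)]
                    for _ in range(pop_size)]).astype(float)
    sigma = mutation_scale * X.std(axis=0)  # per-feature mutation step
    for _ in range(generations):
        fitness = np.array([sse(X, c) for c in pop])
        pop = pop[np.argsort(fitness)]      # best (lowest SSE) first
        children = [pop[0].copy()]          # elitism: keep the best as-is
        while len(children) < pop_size:
            # Pick two parents from the better half of the population.
            a, b = pop[rng.integers(0, pop_size // 2, size=2)]
            # Uniform crossover: each centroid comes from one parent.
            mask = rng.random(k) < 0.5
            child = np.where(mask[:, None], a, b)
            # Gaussian mutation scaled to the spread of each feature.
            child = child + rng.normal(size=(k, d)) * sigma
            children.append(child)
        pop = np.stack(children)
    fitness = np.array([sse(X, c) for c in pop])
    return pop[np.argmin(fitness)]
```

A possible usage, for a float array `X` of shape `(n_samples, n_features)`: call `seed_centroids = ga_seed(X, k=3)` and then `centroids, labels = lloyd(X, seed_centroids)`. Because the elite chromosome is carried over unchanged, the best SSE in the population never gets worse from one generation to the next; whether this seeding beats other initializations on a given dataset is something to measure, not something this sketch establishes.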