The Impact of Enhancing the k-Means Algorithm through Genetic Algorithm Optimization on High Dimensional Data Clustering Outcomes
Published in | 2024 International Conference on Knowledge Engineering and Communication Systems (ICKECS) Vol. 1; pp. 1 - 5 |
---|---|
Main Authors | , , , , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE 18.04.2024 |
Subjects | |
Summary: | In the realm of unsupervised machine learning, the k-Means algorithm stands as a cornerstone for clustering high-dimensional data. However, its efficiency and accuracy can significantly dwindle as the dimensionality of the dataset increases. This paper introduces an innovative approach that integrates Genetic Algorithm (GA) optimization with the k-Means clustering process, aiming to enhance its performance on high-dimensional datasets. The proposed methodology leverages the evolutionary capabilities of Genetic Algorithms to optimize the initial centroid selection and clustering configuration of k-Means, thus addressing its sensitivity to initial conditions and potential to converge to local optima. We systematically detail the implementation of this hybrid algorithm, incorporating mathematical expressions to elucidate the optimization process. Through extensive experiments on synthetic and real-world datasets, we demonstrate the superior clustering outcomes of the enhanced k-Means algorithm in terms of accuracy, robustness, and computational efficiency. The results are presented via a series of graphs and tables, providing a comparative analysis against traditional clustering algorithms. Our findings indicate that the integration of GA into the k-Means algorithm significantly improves its performance on high-dimensional data, making it a powerful tool for data mining applications where conventional clustering methods fall short. This study not only proposes a novel clustering solution but also contributes to the ongoing discourse on optimizing machine learning algorithms for complex datasets. |
---|---|
DOI: | 10.1109/ICKECS61492.2024.10617268 |
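
The summary describes using a Genetic Algorithm to choose the initial centroids that k-Means then refines. The record gives no implementation details, so the following is only a minimal sketch of that general idea, not the authors' method. Everything in it is an assumption made for illustration: the function names (`sse`, `lloyd`, `ga_init`), the chromosome encoding (one set of k centroids per individual), the fitness measure (within-cluster sum of squared errors), truncation selection, centroid-level uniform crossover, Gaussian mutation, and all parameter defaults.

```python
import numpy as np


def sse(X, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()


def lloyd(X, centroids, n_iter=20):
    """Plain k-Means (Lloyd) iterations starting from the given centroids."""
    centroids = centroids.copy()
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(len(centroids)):
            members = X[labels == j]
            if len(members):  # an empty cluster keeps its previous centroid
                centroids[j] = members.mean(axis=0)
    # Recompute labels so they match the final centroids.
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, d2.argmin(axis=1)


def ga_init(X, k, pop_size=20, generations=30, mutation_scale=0.1, seed=0):
    """Evolve candidate centroid sets and return the one with the lowest SSE.

    Hypothetical GA design (assumed, not taken from the paper):
    truncation selection, uniform crossover over whole centroids,
    and Gaussian mutation scaled by each feature's standard deviation.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Each individual starts as k distinct data points used as centroids.
    pop = np.stack(
        [X[rng.choice(n, size=k, replace=False)] for _ in range(pop_size)]
    )
    scale = mutation_scale * X.std(axis=0)
    for _ in range(generations):
        fitness = np.array([sse(X, c) for c in pop])
        # Keep the better half unchanged (elitism), breed the rest.
        elite = pop[np.argsort(fitness)[: pop_size // 2]]
        children = []
        while len(children) < pop_size - len(elite):
            a, b = elite[rng.choice(len(elite), size=2, replace=False)]
            mask = rng.random(k) < 0.5  # pick each centroid from a or b
            child = np.where(mask[:, None], a, b)
            child = child + rng.normal(0.0, scale, size=(k, d))
            children.append(child)
        pop = np.concatenate([elite, np.stack(children)])
    fitness = np.array([sse(X, c) for c in pop])
    return pop[fitness.argmin()]


# Usage on synthetic data: three Gaussian blobs in 20 dimensions.
rng = np.random.default_rng(42)
blob_centers = rng.normal(0.0, 5.0, size=(3, 20))
X = np.concatenate([c + rng.normal(0.0, 1.0, size=(50, 20)) for c in blob_centers])

init = ga_init(X, k=3)           # GA-selected starting centroids
final, labels = lloyd(X, init)   # standard k-Means refinement
print("SSE after GA initialization:", sse(X, init))
print("SSE after k-Means refinement:", sse(X, final))
```

Because the better half of each generation is carried over unchanged, the best SSE in the population cannot get worse from one generation to the next. The cost is `pop_size * generations` extra SSE evaluations before k-Means starts, which is the kind of trade-off an evaluation of such a hybrid would need to measure.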