k-Means Clustering Algorithm and Its Simulation Based on Distributed Computing Platform

Bibliographic Details
Published in	Complexity (New York, N.Y.) Vol. 2021; no. 1
Main Authors	Wu, Chunqiong; Yan, Bingwen; Yu, Rongrui; Yu, Baoqin; Zhou, Xiukao; Yu, Yanliang; Chen, Na
Format	Journal Article
Language	English
Published	Hoboken Hindawi 2021 John Wiley & Sons, Inc Wiley
Subjects	Accuracy; Algorithms; Big Data; Cluster analysis; Clustering; Computer networks; Data analysis; Data mining; Design; Distributed processing; Efficiency; Energy consumption; Explosive plating; Knowledge discovery; Mathematical analysis; Multiprocessing; Optimization; Parallel processing; Random sampling; Stress concentration; Vector quantization
Summary:	The explosive growth of data and the scale of mass storage have brought problems such as high computational complexity and insufficient computational power to clustering research. A distributed computing platform uses load balancing to dynamically configure a large number of virtual computing resources, which effectively breaks through the bottleneck of time and energy consumption and gives it unique advantages in massive data mining. This paper studies parallel k-means in depth. The method first initializes the algorithm by random sampling, then parallelizes the distance calculation, exploiting the independence between data objects so that cluster analysis can run in parallel. With MapReduce parallel processing, many nodes compute distances, which speeds up the algorithm. Finally, the clustering of data objects is parallelized. Results show that the method provides services efficiently and stably and has good convergence.
ISSN:	1076-2787, 1099-0526
DOI:	10.1155/2021/9446653
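
Illustrative sketch (not taken from the article): the Summary describes three steps, namely initialization by random sampling, a distance calculation that is parallelized because data objects are independent, and a parallelized clustering step under MapReduce. The Python below is one minimal, single-process way those steps might be arranged. It is not the authors' implementation; all function names, the `n_chunks` split, the `init` override, and the stopping tolerance are assumptions made for illustration, and the "nodes" are simulated by plain list chunks.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(points, k, seed=0):
    """Assumed initialization: draw k distinct data points at random."""
    return random.Random(seed).sample(points, k)


def map_phase(chunk, centroids):
    """Mapper for one chunk: assign each point to its nearest centroid and
    keep a per-centroid partial (coordinate sum, count)."""
    partial = {}
    for p in chunk:
        idx = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
        s, n = partial.get(idx, ([0.0] * len(p), 0))
        partial[idx] = ([a + b for a, b in zip(s, p)], n + 1)
    return partial


def reduce_phase(partials, centroids):
    """Reducer: merge the partial sums and return the updated centroids.
    A centroid that received no points is left unchanged."""
    totals = {}
    for partial in partials:
        for idx, (s, n) in partial.items():
            ts, tn = totals.get(idx, ([0.0] * len(s), 0))
            totals[idx] = ([a + b for a, b in zip(ts, s)], tn + n)
    updated = []
    for i, c in enumerate(centroids):
        if i in totals:
            s, n = totals[i]
            updated.append(tuple(v / n for v in s))
        else:
            updated.append(tuple(c))
    return updated


def kmeans_mapreduce(points, k, n_chunks=4, max_iter=50, tol=1e-9, seed=0, init=None):
    """Iterate map and reduce until the centroids stop moving.
    `init` (hypothetical parameter) overrides the random sampling step."""
    start = init if init is not None else init_centroids(points, k, seed)
    centroids = [tuple(c) for c in start]
    # Each chunk stands in for the data held by one worker node.
    chunks = [points[i::n_chunks] for i in range(n_chunks)]
    for _ in range(max_iter):
        # The map calls are independent of each other, so they could be
        # dispatched to separate processes or machines.
        partials = [map_phase(ch, centroids) for ch in chunks]
        new_centroids = reduce_phase(partials, centroids)
        shift = max(sq_dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:
            break
    return centroids
```

Because each `map_phase` call reads only its own chunk and the shared centroids, the list comprehension in `kmeans_mapreduce` could be swapped for a parallel map (for example `multiprocessing.Pool.map`) without changing the reducer. Emitting per-centroid sums and counts, rather than individual assignments, keeps the data passed to the reducer small. Note that random initialization can lead k-means to a poor local optimum, so results may depend on the seed.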