CUDA-Based Parallelization of Power Iteration Clustering for Large Datasets

Bibliographic Details
Published in IEEE Access, Vol. 5, pp. 27263-27271
Main Authors Rodrigues Lacerda Silva, Gustavo; Ribeiro De Medeiros, Rafael; Acevedo Jaimes, Brayan Rene; Caldeira Takahashi, Carla; Gomes Vieira, Douglas Alexandre; De Pádua Braga, Antônio
Format Journal Article
Language English
Published Piscataway IEEE 01.01.2017
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: This paper presents GPIC, a graphics processing unit (GPU) accelerated algorithm for power iteration clustering (PIC). The algorithm is based on the original PIC proposal and is adapted to take advantage of the GPU architecture while maintaining the original algorithm's properties. The proposed method was compared against the serial implementation and achieved a considerable speedup in tests with synthetic and real data sets. A large real-data application (>10^7 records) was used, and the GPIC implementation was found to scale well to data sets with millions of data points. The implementation efforts are directed towards two goals: processing large data sets in less time and maintaining the same quality of clustering results as the original PIC version.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2017.2765380
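
The summary above names power iteration clustering without showing what the iteration looks like. As a rough illustration only, the following is a minimal serial sketch of the PIC embedding step in plain Python. It is not the GPIC implementation. The function names, the Gaussian affinity, and the parameters `sigma`, `max_iter`, and `eps` are assumptions made for this example. Each row of the matrix-vector product is independent of the others, which is the kind of work that maps naturally onto a GPU; how GPIC actually organizes it in CUDA is not described in this record.

```python
import math


def gaussian_affinity(points, sigma=1.0):
    """Dense Gaussian affinity matrix with a zero diagonal (assumed similarity)."""
    n = len(points)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                A[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    return A


def pic_embedding(A, max_iter=50, eps=1e-5):
    """One-dimensional PIC embedding via truncated power iteration.

    Assumes every row of A has a non-zero sum (no isolated points).
    """
    degrees = [sum(row) for row in A]
    # Row-normalize the affinity matrix: W = D^-1 A.
    W = [[a / d for a in row] for row, d in zip(A, degrees)]
    # Start from the degree vector, scaled to unit L1 norm.
    total = sum(degrees)
    v = [d / total for d in degrees]
    prev_delta = float("inf")
    for _ in range(max_iter):
        # Each row product is independent of the others.
        new_v = [sum(w * x for w, x in zip(row, v)) for row in W]
        norm = sum(abs(x) for x in new_v)
        new_v = [x / norm for x in new_v]
        delta = max(abs(a - b) for a, b in zip(new_v, v))
        v = new_v
        # Stop early, once the change per step has itself stopped changing.
        if abs(prev_delta - delta) < eps:
            break
        prev_delta = delta
    return v


if __name__ == "__main__":
    # Two well-separated groups of different sizes (illustrative data).
    points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
    print(pic_embedding(gaussian_affinity(points)))
```

In the full method, the entries of the returned vector would then be grouped with a standard clustering step such as k-means. Points in the same cluster end up with nearly equal values, so the one-dimensional embedding is enough to separate them in this toy case.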