A Hybrid MapReduce Implementation of PCA on Tianhe-2

Bibliographic Details
Published in: Journal of Physics: Conference Series, Vol. 1168, no. 5, pp. 52013 - 52019
Main Authors: Yu, Wei; Qu, Yili; Lu, Yutong
Format: Journal Article
Language: English
Published: Bristol, IOP Publishing, 01.02.2019
Summary: "Big Data" has become a ubiquitous term, and researchers want data processing to be more efficient. Principal component analysis (PCA) is an effective data reduction algorithm applied in almost all big data fields. Many machine learning libraries provide commonly used algorithms, but these implementations do not make good use of the resources of a supercomputer system. This paper uses the MapReduce model to design and implement the PCA algorithm with MPI + OpenMP + SIMD hybrid accelerator programming tools on Tianhe-2 and obtains a significant speedup.
ISSN: 1742-6588, 1742-6596
DOI: 10.1088/1742-6596/1168/5/052013
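
Illustrative sketch: The summary above describes PCA expressed in the MapReduce model. This record does not give the paper's actual design, so the following is only a generic, minimal illustration of how PCA can decompose into map and reduce steps. Each partition is mapped to a partial summary (row count, column sums, sum of outer products), the summaries are reduced by addition, and the covariance matrix and its leading eigenvector are computed from the combined result. Plain Python stands in for the MPI + OpenMP + SIMD layers, power iteration is used only to keep the example dependency-free, and all function names and the sample data are hypothetical.

```python
from functools import reduce


def map_partition(rows):
    """Map step: summarize one partition as (count, column sums, sum of outer products)."""
    d = len(rows[0])
    sums = [0.0] * d
    outer = [[0.0] * d for _ in range(d)]
    for x in rows:
        for i in range(d):
            sums[i] += x[i]
            for j in range(d):
                outer[i][j] += x[i] * x[j]
    return len(rows), sums, outer


def reduce_partials(a, b):
    """Reduce step: add two partial summaries element by element."""
    count = a[0] + b[0]
    sums = [u + v for u, v in zip(a[1], b[1])]
    outer = [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a[2], b[2])]
    return count, sums, outer


def covariance(summary):
    """Sample covariance matrix recovered from the combined summary."""
    n, sums, outer = summary
    d = len(sums)
    mean = [s / n for s in sums]
    return [[(outer[i][j] - n * mean[i] * mean[j]) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_component(cov, iters=200):
    """Leading eigenvector of cov by power iteration (illustrative choice)."""
    d = len(cov)
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(c * c for c in w) ** 0.5
        v = [c / norm for c in w]
    return v


# Hypothetical sample data, split into three partitions as if held by three workers.
data = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.2], [5.0, 9.8], [6.0, 12.1]]
partitions = [data[0:2], data[2:4], data[4:6]]

summary = reduce(reduce_partials, map(map_partition, partitions))
cov = covariance(summary)
pc1 = top_component(cov)
```

Because the partial summaries combine by plain addition, the reduce step is associative and commutative. That is the property that would allow it to be carried out as a collective sum across processes in a distributed setting, with the per-partition loops left to threads and vector units.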