A Hybrid MapReduce Implementation of PCA on Tianhe-2
Published in | Journal of physics. Conference series Vol. 1168; no. 5; pp. 52013 - 52019 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | Bristol: IOP Publishing, 01.02.2019 |
Summary: | "Big data" has become a ubiquitous term, and researchers want data processing to be more efficient. Principal component analysis (PCA) is an effective data-reduction algorithm applied in almost all big-data fields. Many machine learning libraries provide commonly used algorithms, but these implementations do not make good use of the resources of a supercomputer system. This paper uses the MapReduce model to design and implement PCA with MPI + OpenMP + SIMD hybrid accelerator programming tools on Tianhe-2 and obtains a significant speedup. |
---|---|
ISSN: | 1742-6588 1742-6596 |
DOI: | 10.1088/1742-6596/1168/5/052013 |
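
The record does not reproduce the paper's code, so the following is only a minimal single-process sketch of how PCA can be phrased in MapReduce terms, assuming the standard covariance-based formulation. Each map call summarizes one chunk of rows as a count, column sums, and a sum of outer products; the reduce step adds these summaries; the leading principal component is then found by power iteration. All function names (`map_chunk`, `reduce_pair`, `mapreduce_pca`, and so on) are hypothetical, and nothing here reflects the MPI, OpenMP, or SIMD layers used on Tianhe-2.

```python
from functools import reduce


def map_chunk(rows):
    """Map step: summarize one chunk as (count, column sums, sum of outer products)."""
    d = len(rows[0])
    sums = [0.0] * d
    outer = [[0.0] * d for _ in range(d)]
    for x in rows:
        for i in range(d):
            sums[i] += x[i]
            for j in range(d):
                outer[i][j] += x[i] * x[j]
    return len(rows), sums, outer


def reduce_pair(a, b):
    """Reduce step: merge two partial summaries by element-wise addition."""
    n = a[0] + b[0]
    sums = [p + q for p, q in zip(a[1], b[1])]
    outer = [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a[2], b[2])]
    return n, sums, outer


def covariance(summary):
    """Turn the merged summary into the sample covariance matrix."""
    n, sums, outer = summary
    d = len(sums)
    mean = [s / n for s in sums]
    return [
        [(outer[i][j] - n * mean[i] * mean[j]) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def top_component(cov, iters=200):
    """Power iteration: leading eigenvector and eigenvalue of a symmetric matrix."""
    d = len(cov)
    v = [1.0 / d ** 0.5] * d  # assumed start vector; must not be orthogonal to the answer
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(c * c for c in w) ** 0.5
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    eigval = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    return v, eigval


def mapreduce_pca(chunks):
    """Run map over chunks, reduce the summaries, and extract the first component."""
    summary = reduce(reduce_pair, map(map_chunk, chunks))
    return top_component(covariance(summary))
```

Because the summaries combine by plain addition, the reduce step is associative and could in principle be carried out across processes with a collective sum, which is one reason this formulation suits a distributed setting. The single-pass covariance formula used here can lose precision when column means are large relative to the variances.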