An Empirical Study of HPC Workloads on Huawei Kunpeng 916 Processor

Bibliographic Details
Published in: 2019 IEEE 25th International Conference on Parallel and Distributed Systems (ICPADS), pp. 360-367
Main Authors: Wang, Yi-Chao; Chen, Jin-Kun; Li, Bin-Rui; Zuo, Si-Cheng; Tang, William; Wang, Bei; Liao, Qiu-Cheng; Xie, Rui; Lin, James
Format: Conference Proceeding
Language: English
Published: IEEE, 01.12.2019
Summary: ARM-based server processors have been gaining momentum in high performance computing (HPC). While not designed specifically for HPC, the Huawei Kunpeng 916 processor has 32 ARMv8 cores and is tempting for HPC workloads. However, its potential remains unknown. To thoroughly understand this potential, we conducted a systematic evaluation in three steps, using: 1) three well-known benchmarks (HPL, STREAM, and LMbench); 2) three typical scientific kernels (SpMV, N-body, and GEMM); 3) three widely used mini-apps (TeaLeaf, Neutral, and SNAP) and a real-world application, GTC-P. We compared the performance results of Kunpeng 916 with those of Intel Xeon E5-2680v3/4 (Haswell/Broadwell). The evaluation results show that Kunpeng 916 has higher memory bandwidth than the two Intel processors and can therefore achieve compelling performance when running memory-bound HPC applications.
DOI: 10.1109/ICPADS47876.2019.00057