Performance Portability Strategies for Grid C++ Expression Templates

Bibliographic Details
Published in	EPJ Web of Conferences, Vol. 175, p. 09006
Main Authors	Boyle, Peter A., Clark, M.A., DeTar, Carleton, Lin, Meifeng, Rana, Verinder, Vaquero Avilés-Casco, Alejandro
Format	Journal Article; Conference Proceeding
Language	English
Published	Les Ulis: EDP Sciences, 01.01.2018
Subjects	C++ (programming language); Experimentation; Portability; Quantum chromodynamics
Summary:	One of the key requirements for the Lattice QCD Application Development as part of the US Exascale Computing Project is performance portability across multiple architectures. Using the Grid C++ expression template as a starting point, we report on the progress made with regard to the Grid GPU offloading strategies. We present both the successes and the issues encountered in using CUDA, OpenACC and Just-In-Time compilation. Experimentation and performance on GPUs with an SU(3)×SU(3) streaming test will be reported. We will also report on the challenges of using current OpenMP 4.x for GPU offloading in the same code.
ISSN:	2101-6275, 2100-014X
DOI:	10.1051/epjconf/201817509006