Automatically Optimizing Stencil Computations on Many-Core NUMA Architectures

Bibliographic Details
Published in	Languages and Compilers for Parallel Computing Vol. 10136; pp. 137 - 152
Main Authors	Lin, Pei-Hung; Yi, Qing; Quinlan, Daniel; Liao, Chunhua; Yan, Yongqing
Format	Book Chapter
Language	English
Published	Switzerland: Springer International Publishing AG, 2017
Series	Lecture Notes in Computer Science
Subjects	Data Placement; Halo Region; NUMA Node; Runtime Library; Stencil Computation
Summary:	This paper presents a system that automatically optimizes stencil kernels on emerging Non-Uniform Memory Access (NUMA) many-core architectures through a combined compiler and runtime approach. A pragma-driven compiler recognizes the special structure and optimization needs of stencil computations and automatically generates low-level code that efficiently uses the data placement and management support of a C++ runtime built on top of the NUMA API, a programming interface to the NUMA policy supported by the Linux kernel. The results show that, through automated specialization of code generation, this approach provides a combined benefit of performance, portability, and productivity for developers.
ISBN:	3319527088 9783319527086
ISSN:	0302-9743 1611-3349
DOI:	10.1007/978-3-319-52709-3_12
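
Illustration (not taken from the chapter): the subject terms "Halo Region" and "Data Placement" can be made concrete with a minimal C++ sketch of a 1D three-point stencil whose grid is split into per-node partitions, each padded with one halo cell per side. All names here (`Partition`, `split`, `exchange_halos`, `step`, `gather`, `reference_step`) are hypothetical, the boundary is assumed to be fixed at zero, and plain `std::vector` storage stands in for node-local allocation, which a real runtime might obtain through libnuma calls such as `numa_alloc_onnode`. It is a sketch of the general technique, not the authors' runtime or generated code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One partition of the global 1D grid. In a NUMA-aware runtime each partition
// would live on its own node; here std::vector is a stand-in (assumption).
struct Partition {
    std::vector<double> cells;  // layout: [left halo | interior ... | right halo]
    std::size_t interior() const { return cells.size() - 2; }
};

// Split `global` into `parts` contiguous partitions with one halo cell per side.
// Assumes every partition receives at least one interior cell.
std::vector<Partition> split(const std::vector<double>& global, std::size_t parts) {
    assert(parts > 0 && global.size() >= parts);
    std::vector<Partition> out(parts);
    const std::size_t n = global.size();
    for (std::size_t p = 0; p < parts; ++p) {
        const std::size_t begin = p * n / parts;
        const std::size_t end = (p + 1) * n / parts;
        out[p].cells.assign(end - begin + 2, 0.0);  // halos start at zero
        for (std::size_t i = begin; i < end; ++i)
            out[p].cells[i - begin + 1] = global[i];
    }
    return out;
}

// Copy each partition's edge interior cell into its neighbour's halo cell.
void exchange_halos(std::vector<Partition>& ps) {
    for (std::size_t p = 0; p + 1 < ps.size(); ++p) {
        Partition& left = ps[p];
        Partition& right = ps[p + 1];
        left.cells.back() = right.cells[1];                 // first interior of right
        right.cells.front() = left.cells[left.interior()];  // last interior of left
    }
}

// One Jacobi-style sweep: every interior cell becomes the mean of itself and
// its two neighbours. Outer halos stay zero (fixed boundary).
void step(std::vector<Partition>& ps) {
    exchange_halos(ps);
    for (Partition& p : ps) {
        std::vector<double> next(p.cells.size(), 0.0);
        for (std::size_t i = 1; i <= p.interior(); ++i)
            next[i] = (p.cells[i - 1] + p.cells[i] + p.cells[i + 1]) / 3.0;
        p.cells.swap(next);
    }
}

// Concatenate the interior cells of all partitions back into one array.
std::vector<double> gather(const std::vector<Partition>& ps) {
    std::vector<double> out;
    for (const Partition& p : ps)
        out.insert(out.end(), p.cells.begin() + 1, p.cells.end() - 1);
    return out;
}

// Unpartitioned version of the same sweep, used to check the partitioned one.
std::vector<double> reference_step(const std::vector<double>& g) {
    std::vector<double> next(g.size());
    for (std::size_t i = 0; i < g.size(); ++i) {
        const double l = i > 0 ? g[i - 1] : 0.0;
        const double r = i + 1 < g.size() ? g[i + 1] : 0.0;
        next[i] = (l + g[i] + r) / 3.0;
    }
    return next;
}
```

Calling `split` on a grid, applying `step` several times, and then calling `gather` should give the same values as applying `reference_step` to the unsplit grid the same number of times. Between sweeps, the halo exchange is the only point at which one partition reads data owned by another.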