Optimising the performance of the spectral/hp element method with collective linear algebra operations

Bibliographic Details
Published in Computer methods in applied mechanics and engineering, Vol. 310, pp. 628-645
Main Authors Moxey, D., Cantwell, C.D., Kirby, R.M., Sherwin, S.J.
Format Journal Article
Language English
Published Elsevier B.V. 01.10.2016
Summary: As computing hardware evolves, increasing core counts mean that memory bandwidth is becoming the deciding factor in attaining peak performance of numerical methods. High-order finite element methods, such as those implemented in the spectral/hp framework Nektar++, are particularly well-suited to this environment. Unlike low-order methods that typically utilise sparse storage, matrices representing high-order operators have greater density and richer structure. In this paper, we show how these qualities can be exploited to increase runtime performance on nodes that comprise a typical high-performance computing system, by amalgamating the action of key operators on multiple elements into a single, memory-efficient block. We investigate different strategies for achieving optimal performance across a range of polynomial orders and element types. As these strategies all depend on external factors such as BLAS implementation and the geometry of interest, we present a technique for automatically selecting the most efficient strategy at runtime.
ISSN: 0045-7825, 1879-2138
DOI: 10.1016/j.cma.2016.07.001
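
The summary describes two ideas: applying one operator to many elements as a single block, and choosing the fastest strategy at runtime. The following is a minimal illustrative sketch of both, not the Nektar++ implementation. All names (`apply_per_element`, `apply_collective`, `autotune`) are hypothetical, the operator is a small dense matrix shared by every element, and pure-Python loops stand in for the BLAS calls a real code would use (the collective variant would typically map to one matrix-matrix product such as `dgemm`).

```python
import time


def matvec(B, u):
    """Apply dense operator B (list of rows) to one coefficient vector u."""
    return [sum(b * x for b, x in zip(row, u)) for row in B]


def apply_per_element(B, U):
    """Strategy 1: loop over elements, one matrix-vector product each."""
    return [matvec(B, u) for u in U]


def apply_collective(B, U):
    """Strategy 2: treat all element vectors as one block, so each row of B
    is reused across every element (the shape of a matrix-matrix product)."""
    out = [[0.0] * len(B) for _ in U]
    for i, row in enumerate(B):
        for e, u in enumerate(U):
            out[e][i] = sum(b * x for b, x in zip(row, u))
    return out


def autotune(strategies, B, U, repeats=3):
    """Time every strategy on the given data and return the fastest name."""
    best_name, best_time = None, float("inf")
    for name, fn in strategies.items():
        start = time.perf_counter()
        for _ in range(repeats):
            fn(B, U)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_name, best_time = name, elapsed
    return best_name


# Hypothetical data: 3 evaluation points x 2 modes, shared by 3 elements.
B = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
U = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

strategies = {"per_element": apply_per_element, "collective": apply_collective}
chosen = autotune(strategies, B, U)
result = strategies[chosen](B, U)
```

Both strategies compute the same result, so the choice between them is purely a performance decision. Timing them on the actual problem, rather than fixing one in advance, reflects the summary's point that the best option depends on factors such as polynomial order, element type, and BLAS implementation.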