A quantitative performance analysis for Stokes solvers at the extreme scale
Main Authors | , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 06.11.2015 |
Subjects | |
Summary: | This article presents a systematic quantitative performance analysis for large finite element computations on extreme scale computing systems. Three parallel iterative solvers for the Stokes system, discretized by low-order tetrahedral elements, are compared with respect to their numerical efficiency and their scalability when running on up to $786\,432$ parallel threads. A genuine multigrid method for the saddle point system using an Uzawa-type smoother provides the best overall performance with respect to memory consumption and time-to-solution. The largest system solved on a Blue Gene/Q system has more than ten trillion ($1.1 \cdot 10^{13}$) unknowns and requires about 13 minutes of compute time. Despite the matrix-free and highly optimized implementation, the memory requirement for the solution vector and the auxiliary vectors is about 200 TByte. Brandt's notion of "textbook multigrid efficiency" is employed to study the algorithmic performance of iterative solvers. A recent extension of this paradigm to "parallel textbook multigrid efficiency" also makes it possible to assess the efficiency of parallel iterative solvers for a given hardware architecture in absolute terms. The efficiency of the method is demonstrated by simulating incompressible fluid flow in a pipe filled with spherical obstacles. |
---|---|
DOI: | 10.48550/arxiv.1511.02134 |
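
The memory figures quoted in the summary can be put in perspective with a short back-of-envelope calculation. The sketch below is not taken from the article. It assumes that values are stored in 8-byte double precision and that "TByte" means $10^{12}$ bytes, and the function names are hypothetical helpers chosen for illustration.

```python
# Back-of-envelope check of the memory figures quoted in the summary.
# Assumptions (not stated in the record): double-precision storage
# and decimal terabytes (1 TByte = 1e12 bytes).

UNKNOWNS = 1.1e13        # "more than ten trillion" unknowns, from the summary
MEMORY_BYTES = 200e12    # "about 200 TByte", from the summary
BYTES_PER_DOUBLE = 8     # assumption: IEEE 754 double precision


def bytes_per_unknown(memory_bytes, unknowns):
    """Average memory footprint attributed to each unknown."""
    return memory_bytes / unknowns


def vector_size_tbyte(unknowns, bytes_per_value=BYTES_PER_DOUBLE):
    """Size of one vector with one stored value per unknown, in TByte."""
    return unknowns * bytes_per_value / 1e12


per_unknown = bytes_per_unknown(MEMORY_BYTES, UNKNOWNS)
one_vector = vector_size_tbyte(UNKNOWNS)
values_per_unknown = per_unknown / BYTES_PER_DOUBLE

print(f"{per_unknown:.1f} bytes per unknown")
print(f"{one_vector:.0f} TByte per vector of unknowns")
print(f"{values_per_unknown:.1f} double-precision values per unknown")
```

Under these assumptions, a single vector of unknowns already occupies about 88 TByte, and the quoted 200 TByte corresponds to roughly 2.3 stored values per unknown. That leaves no room for an assembled sparse matrix, which is consistent with the summary's emphasis on a matrix-free implementation.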