Overcoming GPU Memory Capacity Limitations in Hybrid MPI Implementations of CFD

Bibliographic Details
Published in Internet and Distributed Computing Systems, Vol. 11874, pp. 100-111
Main Authors Choi, Jake; Kim, Yoonhee; Yeom, Heon-young
Format Book Chapter
Language English
Published Switzerland: Springer International Publishing AG, 2019
Springer International Publishing
Series Lecture Notes in Computer Science

Summary: In this paper, we describe a hybrid MPI implementation of a discontinuous Galerkin scheme in Computational Fluid Dynamics (CFD) that can utilize all the available processing units (CPU cores or GPU devices) on each computational node. We describe the optimization techniques used in our GPU implementation, which make it up to 74.88x faster than the single-core CPU implementation in our machine environment. We also perform experiments on work partitioning between heterogeneous devices to measure the load balance that achieves the best performance on a single node consisting of heterogeneous processing units. The key problem is that CFD workloads need to allocate large amounts of both host and GPU device memory in order to compute accurate results. Simply scaling out by adding more nodes with high-end scientific GPU devices imposes an economic burden, as well as additional communication overhead. At a finer granularity, the workload size on each node is also limited by the memory capacity of its attached GPU. To overcome this, we use ZFP, a floating-point compression algorithm, to reduce data usage in our workloads by at least 25%, with less performance degradation than using NVIDIA Unified Memory (UM).
Bibliography: This research was supported by the Next-Generation Information Computing Development Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT (2015M3C4A7065646).
ISBN: 9783030349134; 3030349136
ISSN: 0302-9743; 1611-3349
DOI: 10.1007/978-3-030-34914-1_10