The GPU version of LASG/IAP Climate System Ocean Model version 3 (LICOM3) under the heterogeneous-compute interface for portability (HIP) framework and its large-scale application

Bibliographic Details
Published in Geoscientific model development Vol. 14; no. 5; pp. 2781 - 5561
Main Authors Wang, Pengfei, Jiang, Jinrong, Lin, Pengfei, Ding, Mengrong, Wei, Junlin, Zhang, Feng, Zhao, Lian, Li, Yiwen, Yu, Zipeng, Zheng, Weipeng, Yu, Yongqiang, Chi, Xuebin, Liu, Hailong
Format Journal Article
Language English
Published Copernicus GmbH 18.05.2021
Summary: A high-resolution (1/20°) global ocean general circulation model with graphics processing unit (GPU) code implementations is developed based on the LASG/IAP Climate System Ocean Model version 3 (LICOM3) under a heterogeneous-compute interface for portability (HIP) framework. The dynamic core and physics package of LICOM3 are both ported to the GPU, and three-dimensional parallelization (also partitioned in the vertical direction) is applied. The HIP version of LICOM3 (LICOM3-HIP) is 42 times faster than the same number of CPU cores when 384 AMD GPUs and CPU cores are used. LICOM3-HIP has excellent scalability; it can still obtain a speedup of more than 4 on 9216 GPUs compared to 384 GPUs. In this phase, we successfully performed a test of 1/20° LICOM3-HIP using 6550 nodes and 26 200 GPUs, and on a large scale, the model's speed was increased to approximately 2.72 simulated years per day (SYPD). By putting almost all the computation processes inside GPUs, the time cost of data transfer between CPUs and GPUs was reduced, resulting in high performance. Simultaneously, a 14-year spin-up integration following phase 2 of the Ocean Model Intercomparison Project (OMIP-2) protocol of surface forcing was performed, and preliminary results were evaluated. We found that the model results had little difference from the CPU version. Further comparison with observations and lower-resolution LICOM3 results suggests that the 1/20° LICOM3-HIP can reproduce the observations and produce many smaller-scale activities, such as submesoscale eddies and frontal-scale structures.
ISSN: 1991-959X, 1991-9603
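
The scaling figures quoted in the summary can be put in context with a little arithmetic. The sketch below is illustrative only and is not taken from the article: the function names are hypothetical, the efficiency is only a lower bound (the summary says "more than 4"), and it assumes the usual strong-scaling definition of efficiency as achieved speedup divided by ideal linear speedup.

```python
# Figures quoted in the summary; all names below are hypothetical helpers.
BASE_GPUS = 384        # baseline GPU count
SCALED_GPUS = 9216     # scaled-up GPU count
MIN_SPEEDUP = 4.0      # "a speedup of more than 4", so this is a lower bound
NODES = 6550           # nodes in the large-scale test
GPUS_TOTAL = 26_200    # GPUs in the large-scale test
SYPD = 2.72            # simulated years per wall-clock day


def parallel_efficiency(speedup: float, base_units: int, scaled_units: int) -> float:
    """Strong-scaling efficiency: achieved speedup over ideal linear speedup."""
    ideal_speedup = scaled_units / base_units
    return speedup / ideal_speedup


def hours_per_simulated_year(sypd: float) -> float:
    """Wall-clock hours needed to simulate one model year at a given SYPD."""
    return 24.0 / sypd


ideal = SCALED_GPUS / BASE_GPUS                                    # 24x more GPUs
efficiency = parallel_efficiency(MIN_SPEEDUP, BASE_GPUS, SCALED_GPUS)
gpus_per_node = GPUS_TOTAL / NODES
hours = hours_per_simulated_year(SYPD)

print(f"ideal speedup: {ideal:.0f}x")
print(f"efficiency lower bound: {efficiency:.1%}")
print(f"GPUs per node: {gpus_per_node:.0f}")
print(f"wall-clock hours per simulated year: {hours:.2f}")
```

Under these assumptions, 24 times as many GPUs yielding a speedup above 4 corresponds to a strong-scaling efficiency above roughly one sixth, and 2.72 SYPD corresponds to a little under nine wall-clock hours per simulated year.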