GPU-RRTMG_SW: Accelerating a Shortwave Radiative Transfer Scheme on GPU

Bibliographic Details
Published in IEEE Access, Vol. 9, pp. 84231-84240
Main Authors Wang, Zhenzhen; Wang, Yuzhu; Wang, Xiaocong; Li, Fei; Zhou, Chen; Hu, Hangtian; Jiang, Jinrong
Format Journal Article
Language English
Published Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2021
Abstract As high-performance computing technology continues to develop rapidly, graphics processing units (GPUs) are increasingly used for everyday computing tasks. Some applications, however, such as weather forecasting, require large-scale computation. Because the rapid radiative transfer model for general circulation models (RRTMG) is a computationally intensive component of the Earth system model, this study uses GPU technology to accelerate it. First, two algorithms for accelerating the RRTMG shortwave radiation scheme (RRTMG_SW) on a GPU are proposed. Then, a method for optimizing data transmission between host and device is proposed. Finally, the algorithms are implemented in CUDA Fortran and CUDA C, yielding two GPU versions of RRTMG_SW: CF-RRTMG_SW (CUDA Fortran version) and CC-RRTMG_SW (CUDA C version). The experimental results demonstrate that the proposed acceleration algorithms are effective. Without I/O transfer, running CC-RRTMG_SW on an NVIDIA GeForce Titan V GPU achieved a 38.88× speedup compared to a single Intel Xeon E5-2680 CPU core.
Author Wang, Zhenzhen
Wang, Yuzhu (ORCID 0000-0003-0449-2973)
Wang, Xiaocong
Li, Fei
Zhou, Chen
Hu, Hangtian (ORCID 0000-0002-6985-4042)
Jiang, Jinrong (ORCID 0000-0003-4463-8666)
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2021
DOI 10.1109/ACCESS.2021.3087507
DatabaseName CrossRef
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Engineered Materials Abstracts
METADEX
Technology Research Database
Materials Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
DOAJ Directory of Open Access Journals
Discipline Engineering
EISSN 2169-3536
EndPage 84240
ISSN 2169-3536
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Language English
ORCID 0000-0003-0449-2973
0000-0002-6985-4042
0000-0003-4463-8666
OpenAccessLink https://doaj.org/article/f39eec64001a42e4abbe1f2592840550
PageCount 10
PublicationDate 2021
PublicationPlace Piscataway
PublicationTitle IEEE Access
Publisher The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 84231
SubjectTerms Algorithms
Computation
Compute unified device architecture
Data transmission
FORTRAN
General circulation models
Graphics processing unit
Graphics processing units
High performance computing
Optimization
Radiative transfer
Short wave radiation
Shortwave radiative transfer
Weather forecasting
Title GPU-RRTMG_SW: Accelerating a Shortwave Radiative Transfer Scheme on GPU
URI https://www.proquest.com/docview/2541466685
https://doaj.org/article/f39eec64001a42e4abbe1f2592840550
Volume 9