GPU-RRTMG_SW: Accelerating a Shortwave Radiative Transfer Scheme on GPU
Published in | IEEE Access, Vol. 9, pp. 84231-84240 |
---|---|
Main Authors | Wang, Zhenzhen; Wang, Yuzhu; Wang, Xiaocong; Li, Fei; Zhou, Chen; Hu, Hangtian; Jiang, Jinrong |
Format | Journal Article |
Language | English |
Published | Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE); IEEE, 2021 |
Abstract | As high-performance computing technology continues to develop rapidly, graphics processing units (GPUs) are increasingly used for everyday computing tasks. Some applications, such as weather forecasting, require large-scale computation. Because the rapid radiative transfer model for general circulation models (RRTMG) in the Earth system model is computationally intensive, this study uses GPU technology to accelerate it. First, two algorithms for accelerating the RRTMG shortwave radiation scheme (RRTMG_SW) on GPUs are proposed. Then, a method for optimizing data transmission between host and device is proposed. Finally, these algorithms are implemented in CUDA Fortran and CUDA C, yielding two GPU versions of RRTMG_SW: CF-RRTMG_SW (CUDA Fortran version) and CC-RRTMG_SW (CUDA C version). The experimental results demonstrate that the proposed acceleration algorithms are effective. Without I/O transfer, running CC-RRTMG_SW on an NVIDIA GeForce Titan V GPU achieved a $38.88\times$ speedup compared with a single Intel Xeon E5-2680 CPU core. |
---|---|
Author | Wang, Yuzhu; Zhou, Chen; Hu, Hangtian; Jiang, Jinrong; Wang, Zhenzhen; Wang, Xiaocong; Li, Fei |
Author_xml | 1. Wang, Zhenzhen; 2. Wang, Yuzhu (ORCID: 0000-0003-0449-2973); 3. Wang, Xiaocong; 4. Li, Fei; 5. Zhou, Chen; 6. Hu, Hangtian (ORCID: 0000-0002-6985-4042); 7. Jiang, Jinrong (ORCID: 0000-0003-4463-8666) |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2021 |
DOI | 10.1109/ACCESS.2021.3087507 |
Discipline | Engineering |
EISSN | 2169-3536 |
EndPage | 84240 |
ISSN | 2169-3536 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
ORCID | 0000-0003-0449-2973 0000-0002-6985-4042 0000-0003-4463-8666 |
OpenAccessLink | https://doaj.org/article/f39eec64001a42e4abbe1f2592840550 |
PageCount | 10 |
PublicationDate | 2021-01-01 |
PublicationPlace | Piscataway |
PublicationTitle | IEEE Access |
PublicationYear | 2021 |
Publisher | The Institute of Electrical and Electronics Engineers, Inc. (IEEE); IEEE |
StartPage | 84231 |
SubjectTerms | Algorithms; Computation; compute unified device architecture; Data transmission; FORTRAN; General circulation models; graphics processing unit; Graphics processing units; High performance computing; Optimization; Radiative transfer; Short wave radiation; shortwave radiative transfer; Weather forecasting |
Title | GPU-RRTMG_SW: Accelerating a Shortwave Radiative Transfer Scheme on GPU |
URI | https://www.proquest.com/docview/2541466685 https://doaj.org/article/f39eec64001a42e4abbe1f2592840550 |
Volume | 9 |