Functionally Arranged Data for Algorithms with Space-Time Wavefront
Published in | Parallel Computational Technologies Vol. 1437; pp. 134 - 148 |
---|---|
Main Authors | Perepelkina, Anastasia; Levchenko, Vadim D. |
Format | Book Chapter |
Language | English |
Published | Switzerland: Springer International Publishing AG, 2021 |
Series | Communications in Computer and Information Science |
Subjects | Data structure; Loop skewing; LRnLA algorithms; Parallel algorithms; Temporal blocking |
ISBN | 9783030816902 3030816907 |
ISSN | 1865-0929 1865-0937 |
DOI | 10.1007/978-3-030-81691-9_10 |
Abstract | Algorithms with space-time tiling increase the performance of numerical simulations by increasing data reuse and arithmetic intensity; they also improve parallel scaling by making process synchronization less frequent. The theory of Locally Recursive non-Locally Asynchronous (LRnLA) algorithms provides a performance model that accounts for data localization at all levels of the memory hierarchy. However, effective implementation is difficult, since modern optimizing compilers do not support the required traversal methods and data structures by default. Data exchange is typically implemented by writing the updated values to the main data array. Here, we suggest a new data structure that contains the partially updated state of the simulation domain. Data is arranged within this structure for coalesced access and seamless exchange between subtasks. Preliminary results show its advantage over previously used methods: by localizing the processed data in the L2 GPU cache for a Lattice Boltzmann Method (LBM) simulation, performance is no longer limited by GDDR throughput but is determined by the L2 cache access rate. If the ideal stepwise code is taken to be memory-bound with a read/write ratio of 1, localized in GPU memory, and running at 100% of the theoretical memory bandwidth, then our benchmark results exceed that peak by a factor of about 1.2. |
---|---|
Author | Perepelkina, Anastasia; Levchenko, Vadim D. |
Author_xml | 1. Perepelkina, Anastasia (ORCID 0000-0003-2517-6064; email mogmi@narod.ru); 2. Levchenko, Vadim D. (ORCID 0000-0003-3623-0556) |
ContentType | Book Chapter |
Copyright | Springer Nature Switzerland AG 2021 |
DEWEY | 004.35 |
Discipline | Computer Science |
EISBN | 3030816915 9783030816919 |
EISSN | 1865-0937 |
Editor | Zymbler, Mikhail; Sokolinsky, Leonid |
EndPage | 148 |
IsPeerReviewed | true |
IsScholarly | true |
LCCallNum | QA76.9.S88 |
OCLC | 1260345583 |
ORCID | 0000-0003-2517-6064 0000-0003-3623-0556 |
PageCount | 15 |
PublicationDate | 2021 |
PublicationPlace | Cham, Switzerland |
PublicationSeriesTitle | Communications in Computer and Information Science |
PublicationSeriesTitleAlternate | Communic.Comp.Inf.Science |
PublicationSubtitle | 15th International Conference, PCT 2021, Volgograd, Russia, March 30 - April 1, 2021, Revised Selected Papers |
PublicationTitle | Parallel Computational Technologies |
PublicationYear | 2021 |
Publisher | Springer International Publishing AG; Springer International Publishing |
RelatedPersons | Zhou, Lizhu; Filipe, Joaquim (ORCID 0000-0002-5961-6606); Ghosh, Ashish; Prates, Raquel Oliveira (ORCID 0000-0002-7128-4974) |
StartPage | 134 |
SubjectTerms | Data structure; Loop skewing; LRnLA algorithms; Parallel algorithms; Temporal blocking |
Title | Functionally Arranged Data for Algorithms with Space-Time Wavefront |
URI | http://ebookcentral.proquest.com/lib/SITE_ID/reader.action?docID=6676273&ppg=144 http://link.springer.com/10.1007/978-3-030-81691-9_10 |
Volume | 1437 |