Functionally Arranged Data for Algorithms with Space-Time Wavefront


Bibliographic Details
Published in Parallel Computational Technologies Vol. 1437; pp. 134-148
Main Authors Perepelkina, Anastasia; Levchenko, Vadim D.
Format Book Chapter
Language English
Published Switzerland Springer International Publishing AG 2021
Springer International Publishing
Series Communications in Computer and Information Science
ISBN 9783030816902
3030816907
ISSN 1865-0929
1865-0937
DOI 10.1007/978-3-030-81691-9_10


Abstract Algorithms with space-time tiling increase the performance of numerical simulations by increasing data reuse and arithmetic intensity; they also improve parallel scaling by making process synchronization less frequent. The theory of Locally Recursive non-Locally Asynchronous (LRnLA) algorithms provides a performance model that accounts for data localization at all levels of the memory hierarchy. However, effective implementation is difficult because modern optimizing compilers do not support the required traversal methods and data structures by default. The data exchange is typically implemented by writing the updated values to the main data array. Here, we suggest a new data structure that contains the partially updated state of the simulation domain. Data is arranged within this structure for coalesced access and seamless exchange between subtasks. We present preliminary results showing its advantage over previously used methods: by localizing the processed data in the GPU L2 cache for a Lattice Boltzmann Method (LBM) simulation, the performance is no longer limited by the GDDR throughput but is determined by the L2 cache access rate. If we estimate the ideal stepwise code performance to be memory-bound with a read/write ratio equal to 1, assume that its data is localized in the GPU memory, and assume that it performs at 100% of the theoretical memory bandwidth, then the results of our benchmarks exceed that peak by a factor of the order of 1.2.
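The memory-bound estimate in the last sentence of the abstract can be illustrated with a short calculation. The following is only a sketch under stated assumptions: the lattice model (19 populations per cell, as in D3Q19), the 8-byte value size, and the 448 GB/s bandwidth figure are hypothetical inputs chosen for illustration and are not taken from the record; the function name `stepwise_peak_mlups` is likewise made up. Only the read/write ratio of 1 and the factor of about 1.2 come from the abstract.

```python
def stepwise_peak_mlups(bandwidth_gb_s, q=19, bytes_per_value=8):
    """Memory-bound ceiling of a stepwise LBM code, in million lattice updates per second.

    Assumption: every cell update reads q populations from device memory and
    writes q populations back (read/write ratio of 1), with no cache reuse,
    and the code sustains 100% of the given bandwidth.
    """
    bytes_per_update = 2 * q * bytes_per_value  # q reads + q writes per cell
    updates_per_second = bandwidth_gb_s * 1e9 / bytes_per_update
    return updates_per_second / 1e6


# Hypothetical device bandwidth in GB/s (an assumption, not a value from the source).
peak = stepwise_peak_mlups(bandwidth_gb_s=448.0)

# The abstract reports benchmarks exceeding this ceiling by a factor of order 1.2.
implied = 1.2 * peak

print(f"stepwise memory-bound ceiling: {peak:.0f} MLUps")
print(f"1.2 x ceiling:                 {implied:.0f} MLUps")
```

Exceeding such a ceiling is only possible when part of the traffic is served from a faster level of the memory hierarchy, which is consistent with the abstract's statement that performance is determined by the L2 cache access rate.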
Author Perepelkina, Anastasia
Levchenko, Vadim D.
Author_xml – sequence: 1
  givenname: Anastasia
  orcidid: 0000-0003-2517-6064
  surname: Perepelkina
  fullname: Perepelkina, Anastasia
  email: mogmi@narod.ru
– sequence: 2
  givenname: Vadim D.
  orcidid: 0000-0003-3623-0556
  surname: Levchenko
  fullname: Levchenko, Vadim D.
ContentType Book Chapter
Copyright Springer Nature Switzerland AG 2021
Copyright_xml – notice: Springer Nature Switzerland AG 2021
DEWEY 004.35
DOI 10.1007/978-3-030-81691-9_10
Discipline Computer Science
EISBN 3030816915
9783030816919
EISSN 1865-0937
Editor Zymbler, Mikhail
Sokolinsky, Leonid
Editor_xml – sequence: 1
  fullname: Zymbler, Mikhail
– sequence: 2
  fullname: Sokolinsky, Leonid
EndPage 148
ISBN 9783030816902
3030816907
ISSN 1865-0929
IsPeerReviewed true
IsScholarly true
LCCallNum QA76.9.S88
Language English
OCLC 1260345583
ORCID 0000-0003-2517-6064
0000-0003-3623-0556
PageCount 15
PublicationCentury 2000
PublicationDate 2021
PublicationDateYYYYMMDD 2021-01-01
PublicationDate_xml – year: 2021
  text: 2021
PublicationDecade 2020
PublicationPlace Switzerland
PublicationPlace_xml – name: Switzerland
– name: Cham
PublicationSeriesTitle Communications in Computer and Information Science
PublicationSeriesTitleAlternate Communic.Comp.Inf.Science
PublicationSubtitle 15th International Conference, PCT 2021, Volgograd, Russia, March 30 - April 1, 2021, Revised Selected Papers
PublicationTitle Parallel Computational Technologies
PublicationYear 2021
Publisher Springer International Publishing AG
Springer International Publishing
Publisher_xml – name: Springer International Publishing AG
– name: Springer International Publishing
RelatedPersons Zhou, Lizhu
Filipe, Joaquim
Ghosh, Ashish
Prates, Raquel Oliveira
RelatedPersons_xml – sequence: 1
  givenname: Joaquim
  orcidid: 0000-0002-5961-6606
  surname: Filipe
  fullname: Filipe, Joaquim
– sequence: 2
  givenname: Ashish
  surname: Ghosh
  fullname: Ghosh, Ashish
– sequence: 3
  givenname: Raquel Oliveira
  orcidid: 0000-0002-7128-4974
  surname: Prates
  fullname: Prates, Raquel Oliveira
– sequence: 4
  givenname: Lizhu
  surname: Zhou
  fullname: Zhou, Lizhu
StartPage 134
SubjectTerms Data structure
Loop skewing
LRnLA algorithms
Parallel algorithms
Temporal blocking
Title Functionally Arranged Data for Algorithms with Space-Time Wavefront
URI http://ebookcentral.proquest.com/lib/SITE_ID/reader.action?docID=6676273&ppg=144
http://link.springer.com/10.1007/978-3-030-81691-9_10
Volume 1437