Exploring Loop Scheduling Enhancements in OpenMP: An LLVM Case Study

Bibliographic Details
Published in	2019 18th International Symposium on Parallel and Distributed Computing (ISPDC), pp. 131-138
Main Authors	Kasielke, Franziska; Tschuter, Ronny; Iwainsky, Christian; Velten, Markus; Ciorba, Florina M.; Banicescu, Ioana
Format	Conference Proceeding
Language	English
Published	IEEE 01.06.2019
Subjects	Benchmark testing; dynamic load balancing; dynamic loop self-scheduling; Dynamic scheduling; Heuristic algorithms; LLVM OpenMP runtime; load imbalance; OpenMP; performance study; Processor scheduling; Runtime; work sharing loops
DOI	10.1109/ISPDC.2019.00026

Abstract	OpenMP is the de facto standard for parallel programming on shared-memory systems. The choice of scheduling method for OpenMP work-sharing parallel loops is critical for performance, especially for computationally intensive and irregular parallel loops. The OpenMP standard currently covers three loop scheduling choices: static, guided, and dynamic. These are no longer sufficient to address the load imbalance that adversely affects the execution of computationally intensive and irregular parallel loops. In this work, we present a generic methodology for exploring loop scheduling enhancements in OpenMP that allows the implementation, testing, and usage of additional (more advanced) loop scheduling choices in OpenMP runtime systems. We showcase the methodology by enhancing the LLVM OpenMP runtime with an additional dynamic loop self-scheduling (DLS) technique, known to offer superior load balancing over the existing OpenMP scheduling choices for computationally intensive and irregular parallel loops. We analyze the overhead of the existing and newly added OpenMP loop scheduling methods and show that the proposed methodology incurs no additional overhead. We also study the performance of four benchmarks using the enhanced LLVM OpenMP runtime. The results show that, for the four benchmarks considered, no single loop scheduling strategy outperforms the others. The newly implemented DLS technique provides an additional opportunity for improved execution time with the LLVM OpenMP runtime, one that was not available before this study, and it is competitive with the best previous scheduling choices. This methodology lays the foundation for further loop scheduling additions and explorations in OpenMP.
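As a rough illustration of the three standard scheduling choices named in the abstract, the C sketch below applies schedule(static), schedule(dynamic), and schedule(guided) to the same irregular loop, plus a schedule(runtime) variant that defers the choice to the OMP_SCHEDULE environment variable. The workload function, the function names, and the chunk size of 4 are assumptions made for this example; none of them come from the paper, and the sketch does not show the additional DLS technique the authors implemented.

```c
#include <stdio.h>

/* Hypothetical irregular workload: iteration i costs O(i) operations,
   so later iterations are more expensive than earlier ones. */
static long work(int i) {
    long acc = 0;
    for (int k = 0; k <= i; ++k)
        acc += k % 7;
    return acc;
}

/* Serial reference used to check the parallel variants. */
long sum_serial(int n) {
    long total = 0;
    for (int i = 0; i < n; ++i)
        total += work(i);
    return total;
}

/* static: iterations are divided among threads before the loop runs. */
long sum_static(int n) {
    long total = 0;
    #pragma omp parallel for schedule(static) reduction(+:total)
    for (int i = 0; i < n; ++i)
        total += work(i);
    return total;
}

/* dynamic: idle threads request fixed-size chunks (here 4 iterations). */
long sum_dynamic(int n) {
    long total = 0;
    #pragma omp parallel for schedule(dynamic, 4) reduction(+:total)
    for (int i = 0; i < n; ++i)
        total += work(i);
    return total;
}

/* guided: chunk sizes shrink as the loop progresses, down to a minimum of 4. */
long sum_guided(int n) {
    long total = 0;
    #pragma omp parallel for schedule(guided, 4) reduction(+:total)
    for (int i = 0; i < n; ++i)
        total += work(i);
    return total;
}

/* runtime: the schedule is read from OMP_SCHEDULE when the loop starts. */
long sum_runtime(int n) {
    long total = 0;
    #pragma omp parallel for schedule(runtime) reduction(+:total)
    for (int i = 0; i < n; ++i)
        total += work(i);
    return total;
}
```

When built with an OpenMP-enabled compiler (for example with the -fopenmp flag of GCC or Clang), every variant returns the same sum and only the distribution of iterations across threads differs. Setting OMP_SCHEDULE to a value such as "guided,4" before calling sum_runtime selects the schedule without recompiling, which is one convenient way to compare strategies on a given loop.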
Author affiliations	Kasielke, Franziska (Technische Universität Dresden); Tschuter, Ronny (Technische Universität Dresden); Iwainsky, Christian (Technische Universität Darmstadt); Velten, Markus (Technische Universität Dresden); Ciorba, Florina M. (University of Basel); Banicescu, Ioana (Mississippi State University)
EISBN	9781728138015 1728138019
URI	https://ieeexplore.ieee.org/document/8790837