Profile-driven instruction level parallel scheduling with application to super blocks

Bibliographic Details
Published in	Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture pp. 58 - 67
Main Authors	Chekuri, C.; Johnson, R.; Motwani, R.; Natarajan, B.; Rau, B. R.; Schlansker, M.
Format	Conference Proceeding
Language	English
Published	Washington, DC, USA: IEEE Computer Society, 02.12.1996
Series	ACM Conferences
Subjects	Hardware > Electronic design automation > Logic synthesis > Circuit optimization; Mathematics of computing > Discrete mathematics > Graph theory > Graph algorithms; Theory of computation > Randomness, geometry and discrete structures; linear code regions; optimising compilers; abstract model; scheduling heuristic; long-instruction-word machines; optimum scheduling; profile-sensitive scheduler; ranking branch instructions; code scheduling; compiler optimization; profile-driven instruction level parallel scheduling

Abstract	Code scheduling to exploit instruction level parallelism (ILP) is a critical problem in compiler optimization research in light of the increased use of long-instruction-word machines. Unfortunately, optimum scheduling is computationally intractable, and one must resort to carefully crafted heuristics in practice. If the scope of application of a scheduling heuristic is limited to basic blocks, considerable performance loss may be incurred at block boundaries. To overcome this obstacle, basic blocks can be coalesced across branches to form larger regions such as super blocks. In the literature, these regions are typically scheduled using algorithms that are either oblivious to profile information (under the assumption that the process of forming the region has fully utilized the profile information), or use the profile information as an addendum to classical scheduling techniques. We believe that even for the simple case of linear code regions such as super blocks, additional performance improvement can be gained by utilizing the profile information in scheduling as well. We propose a general paradigm for converting any profile-insensitive list scheduler to a profile-sensitive scheduler. Our technique is developed via a theoretical analysis of a simplified abstract model of the general problem of profile-driven scheduling over any acyclic code region, yielding a scoring measure for ranking branch instructions.
Author	Natarajan, B.; Rau, B. R.; Schlansker, M.; Motwani, R.; Chekuri, C.; Johnson, R.
Author_xml	1. Chekuri, C. (Dept. of Comp. Sci., Stanford Univ., Stanford, CA); 2. Johnson, R. (Hewlett Packard Labs, 1501 Page Mill Rd, Palo Alto, CA); 3. Motwani, R. (Dept. of Comp. Sci., Stanford Univ., Stanford, CA); 4. Natarajan, B. (Hewlett Packard Labs, 1501 Page Mill Rd, Palo Alto, CA); 5. Rau, B. R. (Hewlett Packard Labs, 1501 Page Mill Rd, Palo Alto, CA); 6. Schlansker, M. (Hewlett Packard Labs, 1501 Page Mill Rd, Palo Alto, CA)
ContentType	Conference Proceeding
Copyright	Copyright (c) 1996 Institute of Electrical and Electronics Engineers, Inc. All rights reserved.
Copyright_xml	– notice: Copyright (c) 1996 Institute of Electrical and Electronics Engineers, Inc. All rights reserved.
DOI	10.5555/243846.243858
EndPage	67
ISBN	9780818676413 0818676418
IngestDate	Wed Jan 31 06:40:11 EST 2024 Wed Jan 31 06:40:11 EST 2024
IsPeerReviewed	false
IsScholarly	false
Keywords	linear code regions; optimising compilers; abstract model; scheduling heuristic; long-instruction-word machines; optimum scheduling; profile-sensitive scheduler; ranking branch instructions; code scheduling; compiler optimization; profile-driven instruction level parallel scheduling
Language	English
LinkModel	OpenURL
MeetingName	MICRO96: 29th Annual International Symposium on Microarchitecture
PublicationCentury	1900
PublicationDate	19961202
PublicationDateYYYYMMDD	1996-12-02
PublicationDate_xml	– month: 12 year: 1996 text: 19961202 day: 02
PublicationDecade	1990
PublicationPlace	Washington, DC, USA
PublicationPlace_xml	– name: Washington, DC, USA
PublicationSeriesTitle	ACM Conferences
PublicationTitle	Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
PublicationYear	1996
Publisher	IEEE Computer Society
Publisher_xml	– name: IEEE Computer Society
SourceID	acm
SourceType	Publisher
StartPage	58
SubjectTerms	Hardware -- Electronic design automation -- Logic synthesis -- Circuit optimization; Mathematics of computing -- Discrete mathematics -- Graph theory -- Graph algorithms; Theory of computation -- Randomness, geometry and discrete structures
Title	Profile-driven instruction level parallel scheduling with application to super blocks