Complexity/performance tradeoffs with non-blocking loads

Bibliographic Details
Published in	Proceedings of the 21st International Symposium on Computer Architecture, pp. 211-222
Main Authors	Farkas, K.I., Jouppi, N.P.
Format	Conference Proceeding
Language	English
Published	IEEE Comput. Soc. Press 1994
Subjects	Buffer storage; Costs; Delay; Educational institutions; Hardware; Instruments; Microprocessors; Processor scheduling; Registers
Abstract	Non-blocking loads are a very effective technique for tolerating the cache-miss latency on data cache references. The authors describe several methods for implementing non-blocking loads. A range of resulting hardware complexity/performance tradeoffs are investigated using an object-code translation and instrumentation system. The authors investigate the SPEC92 benchmarks and have found that for the integer benchmarks, a simple hit-under-miss implementation achieves almost all of the available performance improvement for relatively little cost. However, for most of the numeric benchmarks, more expensive implementations are worthwhile. The results also point out the importance of using a compiler capable of scheduling load instructions for cache misses rather than cache hits in non-blocking systems.
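Illustration	The abstract's two main points (that a single outstanding miss already hides latency, and that load scheduling matters) can be illustrated with a toy cycle count. This is a minimal sketch under stated assumptions, not the authors' simulator or methodology: the `Instr` record, both functions, the one-cycle-per-instruction in-order pipeline, and the 10-cycle miss penalty are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Instr:
    """One instruction in a hypothetical in-order trace (illustrative only)."""
    dst: Optional[str] = None      # register written, if any
    srcs: Tuple[str, ...] = ()     # registers read
    misses: bool = False           # True only for a load that misses the cache


def cycles_blocking(trace: Sequence[Instr], miss_penalty: int) -> int:
    """Blocking cache: every miss stalls the pipeline for the full penalty."""
    return sum(1 + (miss_penalty if ins.misses else 0) for ins in trace)


def cycles_hit_under_miss(trace: Sequence[Instr], miss_penalty: int) -> int:
    """Hit-under-miss: at most one miss may be outstanding at a time.

    Execution continues past a miss until an instruction touches the
    pending destination register or a second miss occurs.
    """
    cycle = 0
    outstanding = False            # is a miss currently in flight?
    pending_reg: Optional[str] = None
    ready_at = 0                   # cycle at which the pending data returns
    for ins in trace:
        if outstanding and cycle >= ready_at:
            outstanding = False    # the miss finished in the background
        if outstanding:
            touches = pending_reg in ins.srcs or ins.dst == pending_reg
            if touches or ins.misses:
                cycle = ready_at   # stall until the pending miss returns
                outstanding = False
        cycle += 1                 # issue this instruction
        if ins.misses:
            outstanding, pending_reg = True, ins.dst
            ready_at = cycle + miss_penalty
    if outstanding:
        cycle = max(cycle, ready_at)   # drain the last miss
    return cycle


# Same four instructions, two orderings (register names are arbitrary).
load = Instr(dst="r1", misses=True)
use = Instr(dst="r4", srcs=("r1",))
a = Instr(dst="r2", srcs=("r5",))
b = Instr(dst="r3", srcs=("r6",))
eager = [load, use, a, b]          # use placed right after the load
scheduled = [load, a, b, use]      # independent work placed under the miss
```

In this model, calling both functions on `eager` gives the same count, because the dependent instruction stalls immediately, while `scheduled` lets the two independent instructions execute under the miss. The model ignores everything else a real study would measure (multiple outstanding misses, memory bandwidth, write buffers), so it only shows the direction of the effect.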
Author Affiliation	Farkas, K.I.: Dept. of Electr. & Comput. Eng., Toronto Univ., Ont., Canada
DOI	10.1109/ISCA.1994.288148
ISBN	9780818655104 0818655100
URI	https://ieeexplore.ieee.org/document/288148