Stream: A Modeling Framework for Fine-grained Layer Fusion on Multi-core DNN Accelerators

To keep up with the ever-growing performance demand of DNN processing, specialized hardware (HW) accelerators are shifting towards multi-core architectures. Stream is the first open-source design space exploration (DSE) framework for co-optimization of HW architecture and fine-grained scheduling of...

Bibliographic Details
Published in	2023 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 355-357
Main Authors	Symons, Arne; Mei, Linyan; Colleman, Steven; Houshmand, Pouya; Karl, Sebastian; Verhelst, Marian
Format	Conference Proceeding
Language	English
Published	IEEE 01.04.2023
Subjects	accelerator; Computer architecture; design space exploration; DNN; layer fusion; multi-core; Performance analysis; Reliability; Service-oriented architecture; Software; Space exploration; System-on-chip
DOI	10.1109/ISPASS57527.2023.00051

Abstract	To keep up with the ever-growing performance demand of DNN processing, specialized hardware (HW) accelerators are shifting towards multi-core architectures. Stream is the first open-source design space exploration (DSE) framework for co-optimization of HW architecture and fine-grained scheduling of such multi-core DNN accelerators. Stream supports fine-grained layer fusion to optimally trade off energy, latency, and/or on-chip memory footprint for constrained edge devices. Validation against three state-of-the-art (SotA) chips, together with a case study on seven HW architectures with different scheduling granularity, demonstrates the reliability and capabilities of Stream. Results show that high-level architectural decisions greatly impact HW efficiency under the fine-grained scheduling paradigm, with energy-delay product reductions ranging from 2.4× for single-core architectures to up to 30× for heterogeneous multi-core architectures, compared to traditional scheduling at layer granularity. Stream is open-source at github.com/ZigZag-Project/stream.
Author	Mei, Linyan; Colleman, Steven; Houshmand, Pouya; Karl, Sebastian; Symons, Arne; Verhelst, Marian
Author_xml	– sequence: 1 givenname: Arne surname: Symons fullname: Symons, Arne email: arne.symons@kuleuven.be organization: KU Leuven,Belgium – sequence: 2 givenname: Linyan surname: Mei fullname: Mei, Linyan organization: KU Leuven,Belgium – sequence: 3 givenname: Steven surname: Colleman fullname: Colleman, Steven organization: KU Leuven,Belgium – sequence: 4 givenname: Pouya surname: Houshmand fullname: Houshmand, Pouya organization: KU Leuven,Belgium – sequence: 5 givenname: Sebastian surname: Karl fullname: Karl, Sebastian organization: KU Leuven,Belgium – sequence: 6 givenname: Marian surname: Verhelst fullname: Verhelst, Marian organization: KU Leuven,Belgium
CODEN	IEEPAD
ContentType	Conference Proceeding
DBID	6IE 6IL CBEJK RIE RIL
DatabaseName	IEEE Electronic Library (IEL) Conference Proceedings IEEE Xplore POP ALL IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml	– sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher
DeliveryMethod	fulltext_linktorsrc
EISBN	9798350397390
EndPage	357
ExternalDocumentID	10158113
Genre	orig-research
GroupedDBID	6IE 6IL CBEJK RIE RIL
IEDL.DBID	RIE
IngestDate	Thu Jan 18 11:14:52 EST 2024
IsPeerReviewed	false
IsScholarly	false
Language	English
LinkModel	DirectLink
PageCount	3
ParticipantIDs	ieee_primary_10158113
PublicationCentury	2000
PublicationDate	2023-April
PublicationDateYYYYMMDD	2023-04-01
PublicationDate_xml	– month: 04 year: 2023 text: 2023-April
PublicationDecade	2020
PublicationTitle	2023 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)
PublicationTitleAbbrev	ISPASS
PublicationYear	2023
Publisher	IEEE
Publisher_xml	– name: IEEE
SourceID	ieee
SourceType	Publisher
StartPage	355
Title	Stream: A Modeling Framework for Fine-grained Layer Fusion on Multi-core DNN Accelerators
URI	https://ieeexplore.ieee.org/document/10158113
hasFullText	1
inHoldings	1
isFullTextHit
isPrint
linkProvider	IEEE