Stream: A Modeling Framework for Fine-grained Layer Fusion on Multi-core DNN Accelerators
Published in | 2023 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) pp. 355 - 357 |
---|---|
Main Authors | Symons, Arne; Mei, Linyan; Colleman, Steven; Houshmand, Pouya; Karl, Sebastian; Verhelst, Marian |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.04.2023 |
DOI | 10.1109/ISPASS57527.2023.00051 |
Abstract | To keep up with the ever-growing performance demand of DNN processing, specialized hardware (HW) accelerators are shifting towards multi-core architectures. Stream is the first open-source design space exploration (DSE) framework for co-optimization of HW architecture and fine-grained scheduling of such multi-core DNN accelerators. Stream supports fine-grained layer fusion to optimally trade off energy, latency, and/or on-chip memory footprint for constrained edge devices. Validation against three state-of-the-art (SotA) chips, together with a case study on seven HW architectures with different scheduling granularity, demonstrates the reliability and capabilities of Stream. Results show that high-level architectural decisions greatly impact HW efficiency under the fine-grained scheduling paradigm, with energy-delay product reductions ranging from 2.4× for single-core architectures to up to 30× for heterogeneous multi-core architectures, compared to traditional scheduling at layer granularity. Stream is open-source at github.com/ZigZag-Project/stream. |
---|---|
Author | Mei, Linyan; Colleman, Steven; Houshmand, Pouya; Karl, Sebastian; Symons, Arne; Verhelst, Marian |
Author_xml | 1. Symons, Arne (arne.symons@kuleuven.be), KU Leuven, Belgium; 2. Mei, Linyan, KU Leuven, Belgium; 3. Colleman, Steven, KU Leuven, Belgium; 4. Houshmand, Pouya, KU Leuven, Belgium; 5. Karl, Sebastian, KU Leuven, Belgium; 6. Verhelst, Marian, KU Leuven, Belgium |
CODEN | IEEPAD |
ContentType | Conference Proceeding |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Xplore POP ALL IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP All) 1998-Present |
EISBN | 9798350397390 |
EndPage | 357 |
ExternalDocumentID | 10158113 |
Genre | orig-research |
IsPeerReviewed | false |
IsScholarly | false |
PageCount | 3 |
PublicationCentury | 2000 |
PublicationDate | 2023-April |
PublicationDateYYYYMMDD | 2023-04-01 |
PublicationDate_xml | – month: 04 year: 2023 text: 2023-April |
PublicationDecade | 2020 |
PublicationTitle | 2023 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) |
PublicationTitleAbbrev | ISPASS |
PublicationYear | 2023 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
SourceID | ieee |
SourceType | Publisher |
StartPage | 355 |
SubjectTerms | accelerator; Computer architecture; design space exploration; DNN; layer fusion; multi-core; Performance analysis; Reliability; Service-oriented architecture; Software; Space exploration; System-on-chip |
URI | https://ieeexplore.ieee.org/document/10158113 |