Stream: A Modeling Framework for Fine-grained Layer Fusion on Multi-core DNN Accelerators

Bibliographic Details
Published in 2023 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 355-357
Main Authors Symons, Arne; Mei, Linyan; Colleman, Steven; Houshmand, Pouya; Karl, Sebastian; Verhelst, Marian
Format Conference Proceeding
Language English
Published IEEE 01.04.2023
DOI 10.1109/ISPASS57527.2023.00051

Abstract To keep up with the ever-growing performance demand of DNN processing, specialized hardware (HW) accelerators are shifting towards multi-core architectures. Stream is the first open-source design space exploration (DSE) framework for co-optimization of HW architecture and fine-grained scheduling of such multi-core DNN accelerators. Stream supports fine-grained layer fusion to optimally trade off energy, latency, and/or on-chip memory footprint for constrained edge devices. Validation against three SotA chips, together with a case study on seven HW architectures with different scheduling granularity, demonstrates the reliability and capabilities of Stream. Results show that high-level architectural decisions greatly impact HW efficiency under the fine-grained scheduling paradigm, with energy-delay product reductions ranging from 2.4× for single-core architectures to up to 30× for heterogeneous multi-core architectures, compared to traditional scheduling at layer granularity. Stream is open-source at github.com/ZigZag-Project/stream.
Author_xml – sequence: 1
  givenname: Arne
  surname: Symons
  fullname: Symons, Arne
  email: arne.symons@kuleuven.be
  organization: KU Leuven, Belgium
– sequence: 2
  givenname: Linyan
  surname: Mei
  fullname: Mei, Linyan
  organization: KU Leuven, Belgium
– sequence: 3
  givenname: Steven
  surname: Colleman
  fullname: Colleman, Steven
  organization: KU Leuven, Belgium
– sequence: 4
  givenname: Pouya
  surname: Houshmand
  fullname: Houshmand, Pouya
  organization: KU Leuven, Belgium
– sequence: 5
  givenname: Sebastian
  surname: Karl
  fullname: Karl, Sebastian
  organization: KU Leuven, Belgium
– sequence: 6
  givenname: Marian
  surname: Verhelst
  fullname: Verhelst, Marian
  organization: KU Leuven, Belgium
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/ISPASS57527.2023.00051
EISBN 9798350397390
EndPage 357
ExternalDocumentID 10158113
Genre orig-research
IsPeerReviewed false
IsScholarly false
Language English
PageCount 3
PublicationCentury 2000
PublicationDate 2023-April
PublicationDateYYYYMMDD 2023-04-01
PublicationDate_xml – month: 04
  year: 2023
  text: 2023-April
PublicationDecade 2020
PublicationTitle 2023 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)
PublicationTitleAbbrev ISPASS
PublicationYear 2023
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 355
SubjectTerms accelerator
Computer architecture
design space exploration
DNN
layer fusion
multi-core
Performance analysis
Reliability
Service-oriented architecture
Software
Space exploration
System-on-chip
Title Stream: A Modeling Framework for Fine-grained Layer Fusion on Multi-core DNN Accelerators
URI https://ieeexplore.ieee.org/document/10158113