Stream: A Modeling Framework for Fine-grained Layer Fusion on Multi-core DNN Accelerators
Published in | 2023 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 355–357 |
---|---|
Main Authors | |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.04.2023 |
DOI | 10.1109/ISPASS57527.2023.00051 |
Summary: To keep up with the ever-growing performance demand of DNN processing, specialized hardware (HW) accelerators are shifting towards multi-core architectures. Stream is the first open-source design space exploration (DSE) framework for co-optimization of the HW architecture and the fine-grained scheduling of such multi-core DNN accelerators. Stream supports fine-grained layer fusion to optimally trade off energy, latency, and/or on-chip memory footprint for constrained edge devices. Validation against three state-of-the-art (SotA) chips, together with a case study on seven HW architectures with different scheduling granularities, demonstrates the reliability and capabilities of Stream. Results show that high-level architectural decisions greatly impact HW efficiency under the fine-grained scheduling paradigm, reducing the energy-delay product by 2.4× for single-core architectures and by up to 30× for heterogeneous multi-core architectures compared to traditional scheduling at layer granularity. Stream is open-source at github.com/ZigZag-Project/stream.
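For context on the reported gains, the sketch below illustrates how an energy-delay product (EDP) comparison between a layer-by-layer schedule and a fine-grained fused schedule could be computed. This is not the Stream API; the function, field names, and all numbers are illustrative assumptions, with only the EDP definition (energy multiplied by latency) taken as given.

```python
# Illustrative sketch (not the Stream API): comparing the energy-delay product (EDP)
# of a coarse layer-by-layer schedule against a fine-grained, fused schedule.
# All figures below are made-up placeholders for illustration only.

def edp(energy_mj: float, latency_ms: float) -> float:
    """Energy-delay product: energy multiplied by latency."""
    return energy_mj * latency_ms

# Hypothetical per-workload figures for one accelerator design point.
layer_by_layer = {"energy_mj": 12.0, "latency_ms": 8.0}   # layer-granular schedule
fine_grained   = {"energy_mj":  9.0, "latency_ms": 3.5}   # fused, tile-granular schedule

edp_coarse = edp(**layer_by_layer)
edp_fine   = edp(**fine_grained)

# EDP improvement factor of the fine-grained schedule over the coarse one.
improvement = edp_coarse / edp_fine
print(f"EDP (layer-by-layer): {edp_coarse:.1f} mJ*ms")
print(f"EDP (fine-grained):   {edp_fine:.1f} mJ*ms")
print(f"Improvement: {improvement:.1f}x")  # the paper reports 2.4x (single-core) up to 30x (heterogeneous multi-core)
```

The paper's 2.4× and 30× figures are improvement factors of this form, measured per architecture against traditional layer-granular scheduling.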