Exact Scheduling to Minimize Off-Chip Data Movement for Deep Learning Accelerators

Bibliographic Details
Published in: 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 908-914
Main Authors: Li, Yi; Gupta, Aarti; Malik, Sharad
Format: Conference Proceeding
Language: English
Published: IEEE, 22.01.2024
Summary: Specialized hardware accelerators are increasingly used to improve the performance and power efficiency of Deep Neural Network (DNN) applications. However, their benefits are limited by expensive off-chip data movement between host memory and the accelerator's on-chip scratchpad, which can consume significantly more energy than accelerator computation [13]. While application-level DNN operators can have arbitrary sizes, accelerators typically support only fixed-size operations because of constrained on-chip memory and micro-architectures. Consequently, mapping an application-level operator to an accelerator involves decomposing it into loops over smaller tiles. Different choices of tile sizes, loop orders, and memory partitioning across tensors produce a vast design space with large differences in off-chip data movement volume. To address this challenge, we introduce Shoehorn, a schedule optimization framework that jointly optimizes loop tiling, loop ordering, and memory partitioning for mapping application-level DNN operators to hardware accelerators. Shoehorn can generate optimal schedules in under a second and outperforms state-of-the-art approaches, reducing total off-chip memory traffic by up to 51% relative to competing schedulers for several widely used DNN applications on three distinct hardware accelerator targets.
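The design space described in the summary can be made concrete with a toy example. The sketch below is not Shoehorn's formulation, which the abstract does not describe; it is a brute-force search over tile sizes and loop orders for a tiled matrix multiply C[M,N] += A[M,K] x B[K,N]. The names `traffic` and `best_schedule` are hypothetical, and the traffic model is an assumption: one resident tile per tensor, tile sizes that divide the dimensions evenly, and a scratchpad capacity counted in elements and shared by the three tiles.

```python
from itertools import permutations, product

def divisors(n):
    """Tile sizes that divide n evenly (keeps the toy model free of edge tiles)."""
    return [d for d in range(1, n + 1) if n % d == 0]

def traffic(dims, tiles, order):
    """Off-chip elements moved under the assumed one-tile-per-tensor reuse model.

    dims, tiles: dicts keyed by 'm', 'n', 'k'; order: loop order, outermost first.
    """
    trips = {ax: dims[ax] // tiles[ax] for ax in "mnk"}
    uses = {"A": "mk", "B": "kn", "C": "mn"}  # axes that index each tensor
    total = 0
    for tensor, axes in uses.items():
        # Loops with a single trip never change a tile, so ignore them.
        loops = [ax for ax in order if trips[ax] > 1]
        # A tile stays resident across innermost loops that do not index it.
        while loops and loops[-1] not in axes:
            loops.pop()
        visits = 1
        for ax in loops:
            visits *= trips[ax]
        tile_elems = tiles[axes[0]] * tiles[axes[1]]
        if tensor == "C":
            # Every visit writes the tile; every revisit also reads partial sums back.
            distinct = trips["m"] * trips["n"]
            total += tile_elems * (2 * visits - distinct)
        else:
            total += tile_elems * visits
    return total

def best_schedule(dims, capacity):
    """Exhaustively pick the (cost, tiles, order) with the least modeled traffic."""
    best = None
    for tm, tn, tk in product(*(divisors(dims[ax]) for ax in "mnk")):
        # The three tiles share the scratchpad, so their footprints are traded off jointly.
        if tm * tk + tk * tn + tm * tn > capacity:
            continue
        tiles = {"m": tm, "n": tn, "k": tk}
        for order in permutations("mnk"):
            cost = traffic(dims, tiles, order)
            if best is None or cost < best[0]:
                best = (cost, tiles, "".join(order))
    return best

if __name__ == "__main__":
    print(best_schedule({"m": 64, "n": 64, "k": 64}, capacity=1024))
```

Even in this small model, the same tile sizes give different traffic under different loop orders, and the capacity constraint couples the tile sizes of all three tensors. Exhaustive enumeration like this only scales to tiny problems.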
ISSN: 2153-697X
DOI: 10.1109/ASP-DAC58780.2024.10473916