Reusing GEMM Hardware for Efficient Execution of Depthwise Separable Convolution on ASIC-Based DNN Accelerators
Deep learning (DL) accelerators are optimized for standard convolution. However, lightweight convolutional neural networks (CNNs) use depthwise convolution (DwC) in key layers, and the structural difference between DwC and standard convolution leads to a significant performance bottleneck in executing...
Published in | 2023 28th Asia and South Pacific Design Automation Conference (ASP-DAC) pp. 475-482 |
---|---|
Format | Conference Proceeding |
Language | English |
Published | New York, NY, USA: ACM, 16.01.2023 |
Series | ACM Conferences |