Reusing GEMM Hardware for Efficient Execution of Depthwise Separable Convolution on ASIC-Based DNN Accelerators

Deep learning (DL) accelerators are optimized for standard convolution. However, lightweight convolutional neural networks (CNNs) use depthwise convolution (DwC) in key layers, and the structural difference between DwC and standard convolution leads to a significant performance bottleneck in executing...

Bibliographic Details
Published in 2023 28th Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 475-482
Main Authors Manasi, Susmita Dey; Banerjee, Suvadeep; Davare, Abhijit; Sorokin, Anton A.; Burns, Steven M.; Kirkpatrick, Desmond A.; Sapatnekar, Sachin S.
Format Conference Proceeding
Language English
Published New York, NY, USA ACM 16.01.2023
Series ACM Conferences
