A Heuristic and Greedy Weight Remapping Scheme with Hardware Optimization for Irregular Sparse Neural Networks Implemented on CIM Accelerator in Edge AI Applications

Bibliographic Details
Published in 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 551 - 556
Main Authors Wu, Lizhou, Zhao, Chenyang, Wang, Jingbo, Yu, Xueru, Chen, Shoumian, Li, Chen, Han, Jun, Xue, Xiaoyong, Zeng, Xiaoyang
Format Conference Proceeding
Language English
Published IEEE 22.01.2024

Summary: Computing-in-memory (CIM) is a promising technique for hardware acceleration of neural networks (NNs) with high performance and efficiency. However, conventional dense mapping schemes cannot effectively support the compression and optimization of irregular sparse NNs. In this paper, we propose a heuristic and greedy weight remapping scheme for irregular sparse neural networks implemented on a CIM accelerator in edge AI applications. A genetic algorithm (GA) is, for the first time, utilized for the column shuffle in sparse weight remapping. Combined with a granularity exploration of the CIM, the proportion of compressible all-zero rows increases remarkably. A greedy algorithm is then employed to planarize the unevenly compressed units, thereby improving the storage utilization of the crossbar. For hardware optimization, the pipeline is customized with a zero-skipping circuit that leverages bit-level activation sparsity at runtime. Our results show that the proposed remapping scheme achieves a 70%-94% sparsity utilization rate, a 1.3× improvement on average over naive compression. The co-optimized CIM achieves a 3-7.6× speedup and 2.1-4.8× higher energy efficiency compared with the baseline for dense NNs.
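The core column-shuffle idea from the summary can be sketched as a toy genetic algorithm: permute the columns of a sparse weight matrix so that, within each column group of a chosen granularity, as many rows as possible become all-zero and thus compressible. This is an illustrative reconstruction, not the authors' implementation; the group width, fitness function, and swap-mutation operator are all assumptions.

```python
import random

import numpy as np

def all_zero_rows(w, group):
    """Count rows that are entirely zero within each column group of width `group`.
    Each such row can be skipped (compressed) when mapped to a crossbar unit."""
    count = 0
    for start in range(0, w.shape[1], group):
        block = w[:, start:start + group]
        count += int(np.sum(~block.any(axis=1)))  # rows with no nonzero weight
    return count

def ga_column_shuffle(w, group=4, pop_size=20, generations=50, seed=0):
    """Toy GA: evolve column permutations that maximize the number of
    compressible all-zero rows inside each column group of the CIM macro."""
    rng = random.Random(seed)
    ncols = w.shape[1]

    def fitness(perm):
        return all_zero_rows(w[:, perm], group)

    # Initial population: random column permutations.
    pop = [rng.sample(range(ncols), ncols) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:pop_size // 2]  # elitism: keep the best half
        children = []
        for parent in survivors:
            child = parent[:]
            i, j = rng.sample(range(ncols), 2)  # swap mutation on two columns
            child[i], child[j] = child[j], child[i]
            children.append(child)
        pop = survivors + children
    best = max(pop, key=fitness)
    return best, fitness(best)
```

For a small matrix whose zeros are scattered across column groups, the GA finds a permutation that gathers zeros into the same groups, raising the count of compressible rows relative to the original column order. The greedy planarization step that follows in the paper would then balance the resulting uneven group heights across crossbar units.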
ISSN:2153-697X
DOI:10.1109/ASP-DAC58780.2024.10473919