A 28-nm 0.8M-Weights/mm² 9.1-TOPS/mm² SRAM-Based All-Analog Compute-In-Memory Using Fine-Grained Structured Pruning with Adaptive-Ranging ADC

Bibliographic Details
Published in: 2024 IEEE European Solid-State Electronics Research Conference (ESSERC), pp. 365 - 368
Main Authors: Shiba, Kota; Zhan, Zhijie; Nii, Koji; Wang, Yih; Chang, Tsung-Yung Jonathan; Kosuge, Atsutake; Hamada, Mototsugu; Kuroda, Tadahiro
Format: Conference Proceeding
Language: English
Published: IEEE, 09.09.2024

Summary: A 0.8M-weights/mm² 9.1-TOPS/mm² SRAM-based all-analog CIM for AI edge devices has been developed in 28-nm CMOS. This paper presents the first CIM for both efficient storing and processing of sparse weight matrices using fine-grained structured pruning, achieving a 5.5x improvement in both memory and compute density. All-analog operations, from MAC, biasing, and ReLU to layer-to-layer quantization, achieve high system-level energy efficiency and low latency. They are enabled by a PVT-tracking adaptive-ranging ADC with a sensing margin improved by up to 576x. The proposed CIM achieves >6x higher memory and compute density than other CIMs, with 116 TOPS/W system-level energy efficiency and low latency.
ISSN: 2643-1319
DOI: 10.1109/ESSERC62670.2024.10719563
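The summary above credits fine-grained structured pruning for the 5.5x density gain, but this record does not state the exact sparsity pattern or granularity used on the chip. The following is a minimal software-level sketch of one common form of fine-grained structured pruning, N:M sparsity within fixed-size weight groups; the function name `nm_structured_prune` and the 2:8 pattern are illustrative assumptions, not the paper's scheme.

```python
import numpy as np

def nm_structured_prune(weights, n=2, m=8):
    """Keep the n largest-magnitude weights in every group of m along each row,
    zeroing the rest. The (n, m) choice here is illustrative, not the paper's."""
    rows, cols = weights.shape
    assert cols % m == 0, "number of columns must be divisible by the group size m"
    pruned = weights.copy()
    groups = pruned.reshape(rows, cols // m, m)            # split each row into groups of m
    # indices of the (m - n) smallest-magnitude weights in each group
    drop = np.argsort(np.abs(groups), axis=-1)[..., : m - n]
    np.put_along_axis(groups, drop, 0.0, axis=-1)          # zero them in place
    return groups.reshape(rows, cols)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((4, 16)).astype(np.float32)
    w_sparse = nm_structured_prune(w, n=2, m=8)
    # each row keeps 2 of every 8 weights -> 75% sparsity, evenly spread across groups
    print("density:", np.count_nonzero(w_sparse) / w_sparse.size)
```

Because the surviving weights are confined to a fixed count per group rather than scattered arbitrarily, only the nonzero values and their in-group positions need to be stored, which is what makes such sparsity amenable to dense SRAM storage and regular analog MAC datapaths; how the paper maps this onto its bitcells is not described in this record.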