A 28-nm 0.8M-weights/mm² 9.1-TOPS/mm² SRAM-Based All-Analog Compute-In-Memory Using Fine-Grained Structured Pruning with Adaptive-Ranging ADC
Published in | 2024 IEEE European Solid-State Electronics Research Conference (ESSERC), pp. 365-368 |
---|---|
Format | Conference Proceeding |
Language | English |
Published | IEEE, 09.09.2024 |
Summary: | A 0.8M-weights/mm² 9.1-TOPS/mm² SRAM-based all-analog CIM for AI edge devices has been developed in 28-nm CMOS. This paper presents the first CIM for both efficient storing and processing of sparse weight matrices using fine-grained structured pruning, achieving 5.5x improvement in both memory and compute density. All-analog operations, from MAC, biasing, and ReLU to layer-to-layer quantization, achieve high system-level energy efficiency and low latency. They are enabled by a PVT-tracking adaptive-ranging ADC with sensing margin improved by up to 576x. The proposed CIM achieves >6x higher memory and compute density than other CIMs, with 116 TOPS/W system-level energy efficiency and low latency. |
---|---|
ISSN: | 2643-1319 |
DOI: | 10.1109/ESSERC62670.2024.10719563 |