STAR-SRAM: 43.06-TFLOPS/W, 1.89-TFLOPS/mm², 400-Kb/mm² Floating-Point SRAM-Based Digital Computing-in-Memory Macro in 28-nm CMOS
Published in | 2024 IEEE Custom Integrated Circuits Conference (CICC), pp. 1-2 |
---|---|
Main Authors | , , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 21.04.2024 |
Summary: | Digital computing-in-memory (DCIM) macros are gaining increasing attention as key building blocks in deep neural network (DNN) accelerators. Recent macro designs pursue improvements in three metrics: energy efficiency (EE), compute density (CD), and weight density (WD). Improvements in these metrics translate directly to reduced energy consumption, shortened inference latency, and a smaller silicon footprint at the accelerator level [1]-[7]. Specifically, higher EE, defined as TFLOPS/W, reduces the energy an accelerator consumes for a given inference workload. Higher CD, defined as TFLOPS/mm², improves the latency an accelerator achieves within a given silicon area. Finally, higher WD, defined as Kb/mm², reduces the silicon area and cost needed to meet a given memory-capacity requirement, typically set by the weight data size of the largest layer of a DNN model [1], [2]. |
---|---|
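The three figures of merit in the summary are simple ratios of macro-level quantities. A minimal sketch of how they are computed; the helper names and the example numbers are illustrative assumptions, not values reported in the paper:

```python
# Figures of merit for a DCIM macro, as defined in the summary above.
# Function names and example inputs are illustrative, not from the paper.

def energy_efficiency(tflops: float, power_w: float) -> float:
    """EE in TFLOPS/W: throughput delivered per watt of macro power."""
    return tflops / power_w

def compute_density(tflops: float, area_mm2: float) -> float:
    """CD in TFLOPS/mm²: throughput delivered per unit silicon area."""
    return tflops / area_mm2

def weight_density(capacity_kb: float, area_mm2: float) -> float:
    """WD in Kb/mm²: on-macro weight storage per unit silicon area."""
    return capacity_kb / area_mm2

# Round illustrative numbers (not measured results):
ee = energy_efficiency(tflops=2.0, power_w=0.05)      # 40.0 TFLOPS/W
cd = compute_density(tflops=2.0, area_mm2=1.0)        # 2.0 TFLOPS/mm²
wd = weight_density(capacity_kb=400.0, area_mm2=1.0)  # 400.0 Kb/mm²
```

Because all three metrics share area or power in the denominator, a macro can trade them against one another; reporting all three together, as the title does, pins down the design point.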
ISSN: | 2152-3630 |
DOI: | 10.1109/CICC60959.2024.10529048 |