STAR-SRAM: 43.06-TFLOPS/W, 1.89-TFLOPS/mm2, 400-Kb/mm2 Floating-Point SRAM-Based Digital Computing-in-Memory Macro in 28-nm CMOS


Bibliographic Details
Published in: 2024 IEEE Custom Integrated Circuits Conference (CICC), pp. 1-2
Main Authors: Lin, Chuan-Tung; Oh, Jonghyun; Lee, Kevin; Seok, Mingoo
Format: Conference Proceeding
Language: English
Published: IEEE, 21.04.2024
Summary: A digital computing-in-memory (DCIM) macro has gained increasing attention as a key building block in deep neural network (DNN) accelerators. Recent macro designs pursue the improvement of three metrics: energy efficiency (EE), compute density (CD), and weight density (WD). Improvements in these metrics translate directly into reduced energy consumption, shortened inference latency, and a smaller silicon footprint at the accelerator level [1]-[7]. Specifically, higher EE, defined in TFLOPS/W, reduces the energy an accelerator consumes for a given inference workload. Likewise, higher CD, defined in TFLOPS/mm², improves the latency of an accelerator for a given silicon area. Last but not least, higher WD, defined in Kb/mm², reduces the silicon area and cost for a given memory capacity requirement, typically set by the weight data size of the largest layer of a DNN model [1], [2].
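The three figures of merit in the summary are simple ratios of measured throughput, power, storage capacity, and silicon area. As a minimal sketch (the function names and all numeric inputs below are illustrative assumptions, not measurements from the paper), they can be computed as:

```python
# Hedged sketch (not from the paper): each metric is a ratio of two
# measured macro-level quantities.

def energy_efficiency(throughput_tflops: float, power_w: float) -> float:
    """EE in TFLOPS/W: throughput delivered per watt of macro power."""
    return throughput_tflops / power_w

def compute_density(throughput_tflops: float, area_mm2: float) -> float:
    """CD in TFLOPS/mm^2: throughput delivered per unit of silicon area."""
    return throughput_tflops / area_mm2

def weight_density(capacity_kb: float, area_mm2: float) -> float:
    """WD in Kb/mm^2: on-macro weight storage per unit of silicon area."""
    return capacity_kb / area_mm2

# Illustrative operating point (hypothetical numbers):
print(energy_efficiency(4.0, 0.25))  # 16.0 TFLOPS/W
print(compute_density(1.0, 0.5))     # 2.0 TFLOPS/mm^2
print(weight_density(200.0, 0.5))    # 400.0 Kb/mm^2
```

Under these definitions, the headline numbers in the title (43.06 TFLOPS/W, 1.89 TFLOPS/mm², 400 Kb/mm²) are the paper's reported values for EE, CD, and WD, respectively.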
ISSN: 2152-3630
DOI: 10.1109/CICC60959.2024.10529048