STAR-SRAM: 43.06-TFLOPS/W, 1.89-TFLOPS/mm², 400-Kb/mm² Floating-Point SRAM-Based Digital Computing-in-Memory Macro in 28-nm CMOS
Published in | 2024 IEEE Custom Integrated Circuits Conference (CICC), pp. 1-2 |
---|---|
Main Authors | , , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 21.04.2024 |
Summary: | Digital computing-in-memory (DCIM) macros are gaining increasing attention as key building blocks in deep neural network (DNN) accelerators. Recent macro designs pursue improvements in three metrics: energy efficiency (EE), compute density (CD), and weight density (WD). Improvements in these metrics translate directly to reduced energy consumption, shortened inference latency, and a smaller silicon footprint at the accelerator level [1]-[7]. Specifically, higher EE, defined as TFLOPS/W, reduces the energy an accelerator consumes for a given inference workload. Higher CD, defined as TFLOPS/mm², improves the latency an accelerator achieves within a given silicon area. Finally, higher WD, defined as Kb/mm², reduces the silicon area and cost needed to meet a given memory-capacity requirement, typically set by the weight data size of the largest layer of a DNN model [1], [2]. |
---|---|
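The three figures of merit in the summary are simple ratios of macro-level quantities. A minimal sketch of how they are computed; the helper names and the example numbers are illustrative assumptions, not values reported in the paper:

```python
# Figures of merit for a DCIM macro, as defined in the summary above.
# Function names and example inputs are illustrative, not from the paper.

def energy_efficiency(tflops: float, power_w: float) -> float:
    """EE in TFLOPS/W: throughput delivered per watt of macro power."""
    return tflops / power_w

def compute_density(tflops: float, area_mm2: float) -> float:
    """CD in TFLOPS/mm²: throughput delivered per unit silicon area."""
    return tflops / area_mm2

def weight_density(capacity_kb: float, area_mm2: float) -> float:
    """WD in Kb/mm²: on-macro weight storage per unit silicon area."""
    return capacity_kb / area_mm2

# Round illustrative numbers (not measured results):
ee = energy_efficiency(tflops=2.0, power_w=0.05)      # 40.0 TFLOPS/W
cd = compute_density(tflops=2.0, area_mm2=1.0)        # 2.0 TFLOPS/mm²
wd = weight_density(capacity_kb=400.0, area_mm2=1.0)  # 400.0 Kb/mm²
```

Because all three metrics share area or power in the denominator, a macro can trade them against one another; reporting all three together, as the title does, pins down the design point.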
ISSN: | 2152-3630 |
DOI: | 10.1109/CICC60959.2024.10529048 |