A 8-b-Precision 6T SRAM Computing-in-Memory Macro Using Segmented-Bitline Charge-Sharing Scheme for AI Edge Chips

Advances in static random access memory (SRAM)-CIM devices are meant to increase capacity while improving energy efficiency (EF) and reducing computing latency (<inline-formula> <tex-math notation="LaTeX">T_{\mathrm {AC}} </tex-math></inline-formula>). This work pre...

Full description

Saved in:
Bibliographic Details
Published inIEEE journal of solid-state circuits Vol. 58; no. 3; pp. 877 - 892
Main Authors Su, Jian-Wei, Chou, Yen-Chi, Liu, Ruhui, Liu, Ta-Wei, Lu, Pei-Jung, Wu, Ping-Chun, Chung, Yen-Lin, Hong, Li-Yang, Ren, Jin-Sheng, Pan, Tianlong, Jhang, Chuan-Jia, Huang, Wei-Hsing, Chien, Chih-Han, Mei, Peng-I, Li, Sih-Han, Sheu, Shyh-Shyuan, Chang, Shih-Chieh, Lo, Wei-Chung, Wu, Chih-I, Si, Xin, Lo, Chung-Chuan, Liu, Ren-Shuo, Hsieh, Chih-Cheng, Tang, Kea-Tiong, Chang, Meng-Fan
Format Journal Article
LanguageEnglish
Published New York IEEE 01.03.2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Advances in static random access memory (SRAM)-CIM devices are meant to increase capacity while improving energy efficiency (EF) and reducing computing latency (<inline-formula> <tex-math notation="LaTeX">T_{\mathrm {AC}} </tex-math></inline-formula>). This work presents a novel SRAM-CIM structure using: 1) a segmented-bitline charge-sharing (SBCS) scheme for multiply-and-accumulate (MAC) operations with low energy consumption and a consistently high signal margin across MAC values; 2) a bitline-combining (BL-CMB) scheme to reduce the number of analog-to-digital converters (ADCs) and, thereby, provide options in determining a tradeoff between EF and inference accuracy; 3) a source-injection local-multiplication cell (SILMC) connected to two types of global-bitline-switch to support the SBCS and BL-CMB schemes with consistent signal margin against process variation in transistors; and 4) prioritized-hybrid ADC to suppress area and power overhead for analog readout operations. We fabricated a 28-nm 384-kb SRAM-CIM macro using foundry-provided compact-6T cells supporting MAC operations with 16 accumulations of 8-b input and 8-b weight with near-full precision output (20 b). This macro achieved <inline-formula> <tex-math notation="LaTeX">T_{\mathrm {AC}} </tex-math></inline-formula> of 7.2 ns and EF of 22.75 TOPS/W performing 8-b-MAC operations.
ISSN:0018-9200
1558-173X
DOI:10.1109/JSSC.2022.3199077