An Energy Efficient Time-Multiplexing Computing-in-Memory Architecture for Edge Intelligence
Published in: IEEE Journal on Exploratory Solid-State Computational Devices and Circuits, Vol. 8, No. 2, pp. 111-118
Main Authors: , ,
Format: Journal Article
Language: English
Published: Piscataway: IEEE, 01.12.2022 (The Institute of Electrical and Electronics Engineers, Inc.)
Subjects:
Summary: The growing data volume and complexity of deep neural networks (DNNs) require new architectures to surpass the limitation of the von Neumann bottleneck, with computing-in-memory (CIM) as a promising direction for implementing energy-efficient neural networks. However, CIM's peripheral sensing circuits are usually power- and area-hungry components. We propose a time-multiplexing CIM architecture (TM-CIM) based on memristive analog computing that shares the peripheral circuits and processes one column at a time. The memristor array is arranged in a column-wise manner that avoids wasting power/energy on unselected columns. In addition, the power and energy of the digital-to-analog converters (DACs), which turn out to be an even greater overhead than that of the analog-to-digital converters (ADCs), can be fine-tuned in TM-CIM for significant improvement. For a 256×256 crossbar array with a typical setting, TM-CIM saves 18.4× in energy with 0.136 pJ/MAC efficiency, along with 19.9× in area for the 1T1R case and 15.9× for the 2T2R case. Performance estimation on VGG-16 indicates that TM-CIM can save over 16× in area. A tradeoff between chip area, peak power, and latency is also presented, with a proposed scheme that further reduces latency on VGG-16 without significantly increasing chip area and peak power.
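The column-at-a-time operation described in the abstract can be illustrated with a toy numerical model. This is a hedged sketch, not the paper's circuit or code: it only shows that reading one crossbar column per cycle through a single shared peripheral path yields the same matrix-vector product that a fully parallel per-column-ADC readout would, which is the functional premise behind sharing the sensing circuits. All names (`time_multiplexed_mac`, `G`, `v`) are illustrative assumptions.

```python
import numpy as np

# Toy model of a memristive crossbar: weights are stored as conductances G,
# inputs are applied as voltages v, and each column's output current encodes
# one dot product (one MAC result per column).
rng = np.random.default_rng(0)
ROWS, COLS = 256, 256                 # crossbar size used in the paper's evaluation

G = rng.uniform(0.0, 1.0, size=(ROWS, COLS))   # conductance matrix (weights)
v = rng.uniform(0.0, 1.0, size=ROWS)           # input vector (activations)

def time_multiplexed_mac(G, v):
    """Read one column per cycle through a single shared sensing path.

    A conventional CIM macro senses all columns in parallel (one ADC per
    column); here the columns are visited sequentially, so only one
    peripheral circuit is needed and unselected columns stay idle.
    """
    out = np.empty(G.shape[1])
    for col in range(G.shape[1]):      # one column selected per cycle
        out[col] = v @ G[:, col]       # analog MAC of the selected column
    return out

result = time_multiplexed_mac(G, v)
# Functionally identical to the fully parallel matrix-vector product;
# the difference is latency (COLS cycles) traded for peripheral area/power.
assert np.allclose(result, v @ G)
```

The tradeoff the abstract mentions shows up directly in this model: the sequential loop costs COLS cycles of latency per matrix-vector product, in exchange for needing one shared ADC/sensing path instead of one per column.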
ISSN: 2329-9231
DOI: 10.1109/JXCDC.2022.3206879