A Memory-Efficient CNN Accelerator Using Segmented Logarithmic Quantization and Multi-Cluster Architecture
| Field | Value |
|---|---|
| Published in | IEEE Transactions on Circuits and Systems II: Express Briefs, Vol. 68, No. 6, pp. 2142–2146 |
| Main Authors | |
| Format | Journal Article |
| Language | English |
| Published | New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.06.2021 |
Summary: This brief presents a memory-efficient CNN accelerator design for resource-constrained devices in Internet of Things (IoT) and autonomous systems. A segmented logarithmic (SegLog) quantization method is exploited to reduce the on-chip memory and bandwidth requirements, thus accommodating more processing elements (PEs) in a given chip area to organize a reconfigurable multi-cluster architecture. The evaluation results show that SegLog quantization can achieve 6.4× model compression with less than 2.5% accuracy loss on various CNNs. An ASIC implementation with a 168-PE configuration is validated in a 40-nm CMOS process, with 2.54 TOPs/W energy efficiency and 0.8 mm² chip area reported. The accelerator has also been implemented on an FPGA with 1512 PEs and 468 kB of on-chip memory, achieving 1.29 GOPs/kB memory efficiency. Compared with state-of-the-art accelerators, the ASIC implementation improves area efficiency and arithmetic intensity by 1.94× and 5.62×, while the FPGA implementation improves memory efficiency by a factor of 2.34×.
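The summary does not spell out the SegLog algorithm; as background, plain logarithmic quantization rounds each weight magnitude to the nearest power of two, so multiplications reduce to shifts. A minimal illustrative sketch (the function name, bitwidth, and encoding are assumptions, not the paper's exact segmented scheme):

```python
import numpy as np

def log2_quantize(w, bitwidth=4):
    """Quantize weights to signed powers of two.

    A simple logarithmic quantizer illustrating the idea underlying
    SegLog; the paper's segmented variant (which mixes base-2 and finer
    segments) is not reproduced here.
    """
    sign = np.sign(w)
    # Round the log2 of each magnitude to the nearest integer exponent.
    exp = np.round(np.log2(np.abs(w) + 1e-12))
    # Clip exponents to the range representable with the given bitwidth
    # (one bit for the sign, the rest for the exponent).
    min_exp = -(2 ** (bitwidth - 1) - 1)
    exp = np.clip(exp, min_exp, 0)
    return sign * (2.0 ** exp)

w = np.array([0.6, -0.12, 0.03])
print(log2_quantize(w))  # -> [ 0.5     -0.125    0.03125]
```

Because every quantized weight is ±2^k, a hardware PE can replace each multiply with a barrel shift, which is what lets more PEs fit in the same chip area.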
ISSN: 1549-7747, 1558-3791
DOI: 10.1109/TCSII.2020.3038897