ASPoT8: An Efficient 8-bit Quantization Balancing Hardware and Accuracy for Object Detection
Published in: IEICE Transactions on Information and Systems, Vol. E108.D, No. 9, pp. 1146-1149
Format: Journal Article
Language: English
Published: The Institute of Electronics, Information and Communication Engineers, 01.09.2025
ISSN: 0916-8532, 1745-1361
DOI: 10.1587/transinf.2024EDL8089
Summary: Hardware accelerators using fixed-point quantization run object detection neural networks efficiently, but high-bit quantization demands substantial hardware area and power, while low-bit quantization sacrifices accuracy. To address this trade-off, we introduce an 8-bit quantization scheme, ASPoT8, which replaces INT8 multiplications with add/shift operations, minimizing hardware area and power consumption without compromising accuracy. ASPoT8 adjusts the quantized value distribution to match INT8's accuracy. Tests on YOLOv3-Tiny and MobileNetV2-SSDLite show minimal mAP drops of 0.5% and 0.2%, respectively, with significant reductions in power (76.31%), delay (29.46%), and area (58.40%) over INT8, evaluated on the SMIC 40 nm process.
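The summary's central claim, replacing INT8 multiplications with add/shift operations, rests on additive powers-of-two quantization. Below is a minimal sketch of that general idea only; the exact ASPoT8 level set, rounding rule, and distribution adjustment described in the paper are not reproduced here, and the helper names are illustrative assumptions.

```python
# Sketch of additive powers-of-two (APoT) multiplication: a weight is
# represented as sign * (2**a + 2**b), so multiplying an activation by
# it needs only two shifts and an add -- no hardware multiplier.
# NOTE: illustrative only; not the paper's exact ASPoT8 scheme.

def apot_mul(x: int, a: int, b: int, sign: int = 1) -> int:
    """Multiply activation x by the weight sign * (2**a + 2**b)
    using two left shifts and one addition."""
    return sign * ((x << a) + (x << b))

def nearest_apot(w: int, max_shift: int = 6) -> tuple[int, int, int]:
    """Round an integer weight to the closest level sign * (2**a + 2**b),
    searching shift pairs with b <= a up to max_shift."""
    sign = 1 if w >= 0 else -1
    a, b = min(((a, b) for a in range(max_shift + 1)
                for b in range(a + 1)),
               key=lambda ab: abs(abs(w) - (1 << ab[0]) - (1 << ab[1])))
    return sign, a, b

# A multiply-free MAC step: weight 18 = 2**4 + 2**1, activation 7.
s, a, b = nearest_apot(18)
print(apot_mul(7, a, b, s))  # 126, same as 7 * 18
```

Because every weight level decomposes into two shifts, the multiplier array in a MAC unit can be replaced by shifters and adders, which is the source of the power, delay, and area savings the abstract reports.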