ASPoT8: An Efficient 8-bit Quantization Balancing Hardware and Accuracy for Object Detection

Bibliographic Details
Published in: IEICE Transactions on Information and Systems, Vol. E108.D, No. 9, pp. 1146-1149
Main Authors: LI, Hui; YANG, Xiaofeng; ZHENG, Zebin; LI, Jinyi; LU, Shengli
Format: Journal Article
Language: English
Published: The Institute of Electronics, Information and Communication Engineers, 01.09.2025
ISSN: 0916-8532, 1745-1361
DOI: 10.1587/transinf.2024EDL8089

Summary: Hardware accelerators using fixed-point quantization run object detection neural networks efficiently, but high-bit quantization demands substantial hardware area and power, while low-bit quantization sacrifices accuracy. To address this trade-off, we introduce an 8-bit quantization scheme, ASPoT8, which replaces INT8 multiplications with add/shift operations, minimizing hardware area and power consumption without compromising accuracy. ASPoT8 adjusts the distribution of quantized values to match INT8's accuracy. Tests on YOLOv3-Tiny and MobileNetV2 SSDLite show minimal mAP drops of 0.5% and 0.2%, respectively, with significant reductions in power (76.31%), delay (29.46%), and area (58.40%) over INT8, based on SMIC 40 nm synthesis.
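The summary's core idea, replacing each INT8 multiply with shifts and adds, can be illustrated with a toy sketch. Note this is a generic additive power-of-two (APoT-style) illustration under our own assumptions, not the paper's actual ASPoT8 codebook or scaling, which the record does not specify: each weight magnitude is rounded to the nearest value of the form 2^a or 2^a + 2^b, so multiplying by it costs at most two shifts and one add.

```python
# Hypothetical sketch of shift/add multiplication via additive
# power-of-two weights. NOT the authors' ASPoT8 implementation;
# the codebook below is an illustrative assumption.

def apot_codebook(max_exp=6):
    """Map each representable magnitude to its exponent pair.

    Magnitudes have the form 2^a (one shift) or 2^a + 2^b
    (two shifts and an add), with exponents up to max_exp.
    """
    book = {}
    for a in range(max_exp + 1):
        book.setdefault(1 << a, (a, None))
        for b in range(a + 1):
            book.setdefault((1 << a) + (1 << b), (a, b))
    return book


def apot_quantize(w, book):
    """Round w to the nearest codebook magnitude, keeping its sign."""
    mag = min(book, key=lambda v: abs(v - abs(w)))
    a, b = book[mag]
    return (-1 if w < 0 else 1), a, b


def shift_add_multiply(x, w, book):
    """Compute x times the quantized w using only shifts and adds."""
    sign, a, b = apot_quantize(w, book)
    prod = x << a
    if b is not None:
        prod += x << b
    return sign * prod


book = apot_codebook()
print(shift_add_multiply(3, 5, book))   # 5 = 2^2 + 2^0, so (3<<2) + (3<<0) = 15
print(shift_add_multiply(3, 4, book))   # 4 = 2^2, a single shift: 12
print(shift_add_multiply(3, -5, book))  # sign handled separately: -15
```

In hardware, such a codebook lets the multiplier array be replaced by barrel shifters and one adder per weight term, which is the kind of substitution that yields the power, delay, and area savings the summary reports.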