ASPoT8: An Efficient 8-bit Quantization Balancing Hardware and Accuracy for Object Detection
Published in: IEICE Transactions on Information and Systems, Vol. E108.D, No. 9, pp. 1146-1149
Format: Journal Article
Language: English
Published: The Institute of Electronics, Information and Communication Engineers, 01.09.2025
ISSN: 0916-8532, 1745-1361
DOI: 10.1587/transinf.2024EDL8089
Summary: Hardware accelerators using fixed-point quantization run object detection neural networks efficiently, but high-bit quantization demands substantial hardware area and power, while low-bit quantization sacrifices accuracy. To address this trade-off, we introduce an 8-bit quantization scheme, ASPoT8, which replaces INT8 multiplications with add/shift operations, minimizing hardware area and power consumption without compromising accuracy. ASPoT8 adjusts the quantized value distribution to match INT8's accuracy. Tests on YOLOv3-Tiny and MobileNetV2-SSDLite show minimal mAP drops of 0.5% and 0.2%, respectively, with significant reductions in power (76.31%), delay (29.46%), and area (58.40%) over INT8, evaluated on the SMIC 40 nm process.
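The summary's central claim, replacing INT8 multiplications with add/shift operations, rests on additive powers-of-two quantization. Below is a minimal sketch of that general idea only; the exact ASPoT8 level set, rounding rule, and distribution adjustment described in the paper are not reproduced here, and the helper names are illustrative assumptions.

```python
# Sketch of additive powers-of-two (APoT) multiplication: a weight is
# represented as sign * (2**a + 2**b), so multiplying an activation by
# it needs only two shifts and an add -- no hardware multiplier.
# NOTE: illustrative only; not the paper's exact ASPoT8 scheme.

def apot_mul(x: int, a: int, b: int, sign: int = 1) -> int:
    """Multiply activation x by the weight sign * (2**a + 2**b)
    using two left shifts and one addition."""
    return sign * ((x << a) + (x << b))

def nearest_apot(w: int, max_shift: int = 6) -> tuple[int, int, int]:
    """Round an integer weight to the closest level sign * (2**a + 2**b),
    searching shift pairs with b <= a up to max_shift."""
    sign = 1 if w >= 0 else -1
    a, b = min(((a, b) for a in range(max_shift + 1)
                for b in range(a + 1)),
               key=lambda ab: abs(abs(w) - (1 << ab[0]) - (1 << ab[1])))
    return sign, a, b

# A multiply-free MAC step: weight 18 = 2**4 + 2**1, activation 7.
s, a, b = nearest_apot(18)
print(apot_mul(7, a, b, s))  # 126, same as 7 * 18
```

Because every weight level decomposes into two shifts, the multiplier array in a MAC unit can be replaced by shifters and adders, which is the source of the power, delay, and area savings the abstract reports.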