Block-Wise Dynamic-Precision Neural Network Training Acceleration via Online Quantization Sensitivity Analytics

Bibliographic Details
Published in	Proceedings of the 28th Asia and South Pacific Design Automation Conference, pp. 372-377
Main Authors	Liu, Ruoyang; Wei, Chenhan; Yang, Yixiong; Wang, Wenxun; Yang, Huazhong; Liu, Yongpan
Format	Conference Proceeding
Language	English
Published	New York, NY, USA: ACM, 16.01.2023
Series	ACM Conferences
Subjects	Computing methodologies > Artificial intelligence > Philosophical/theoretical foundations of artificial intelligence; Computing methodologies > Machine learning > Machine learning approaches > Neural networks; fully-quantized network training; mixed-precision quantization; neural network training acceleration

More Information
Summary:	Data quantization is an effective method to accelerate neural network training and reduce power consumption. However, low-bit quantized training is challenging: conventional equal-precision quantization leads to either high accuracy loss or limited bit-width reduction, while existing mixed-precision methods offer high compression potential but fail to perform accurate and efficient bit-width assignment. In this work, we propose DYNASTY, a block-wise dynamic-precision neural network training framework. DYNASTY provides accurate data sensitivity information through fast online analytics, and maintains stable training convergence with an adaptive bit-width map generator. Network training experiments on the CIFAR-100 and ImageNet datasets show that, compared to an 8-bit quantization baseline, DYNASTY achieves up to 5.1× speedup and 4.7× energy consumption reduction with no accuracy drop and negligible hardware overhead.
ISBN:	9781450397834 1450397832
DOI:	10.1145/3566097.3567876