Quantization-Guided Training for Compact TinyML Models

Bibliographic Details
Published in: arXiv.org
Main Authors: Ghamari, Sedigh; Ozcan, Koray; Dinh, Thu; Melnikov, Andrey; Carvajal, Juan; Ernst, Jan; Chai, Sek
Format: Paper
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 10.03.2021

Summary: We propose a Quantization Guided Training (QGT) method to guide DNN training towards optimized low-bit-precision targets and reach extreme compression levels below 8-bit precision. Unlike standard quantization-aware training (QAT) approaches, QGT uses customized regularization to encourage weight values towards a distribution that maximizes accuracy while reducing quantization errors. One of the main benefits of this approach is the ability to identify compression bottlenecks. We validate QGT using state-of-the-art model architectures on vision datasets. We also demonstrate the effectiveness of QGT with an 81KB tiny model for person detection quantized down to 2-bit precision (a 17.7x size reduction), with an accuracy drop of only 3% compared to a floating-point baseline.
ISSN: 2331-8422
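
To make the regularization idea in the summary concrete, below is a minimal PyTorch-style sketch, not the paper's exact formulation: a penalty term pulls each weight toward the nearest level of a symmetric uniform b-bit quantizer, so the trained weights incur less error when quantized. The names quantization_penalty, training_step, and the coefficient lam are illustrative assumptions, not identifiers from the paper.

    import torch
    import torch.nn as nn

    def quantization_penalty(weight: torch.Tensor, bits: int = 2) -> torch.Tensor:
        # Illustrative regularizer (an assumption, not the exact QGT term):
        # mean squared distance between each weight and its nearest level of a
        # per-tensor symmetric uniform quantizer with 2**bits - 1 levels.
        num_steps = 2 ** bits - 1
        max_abs = weight.detach().abs().max().clamp(min=1e-8)
        scale = 2 * max_abs / num_steps                  # grid step size
        nearest = torch.round(weight / scale) * scale    # nearest grid point
        return ((weight - nearest) ** 2).mean()

    def training_step(model: nn.Module, inputs, targets, task_loss_fn,
                      lam: float = 1e-2, bits: int = 2) -> torch.Tensor:
        # Combine the task loss with the quantization-guidance penalty so that
        # gradient descent nudges weights toward quantization-friendly values.
        task_loss = task_loss_fn(model(inputs), targets)
        reg = sum(quantization_penalty(p, bits)
                  for name, p in model.named_parameters()
                  if name.endswith("weight"))
        return task_loss + lam * reg

Raising lam trades task loss for lower quantization error; tracking how accuracy behaves as lam and the bit width are tightened is one way a penalty of this kind can surface the compression bottlenecks the summary refers to.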