Quantization-Guided Training for Compact TinyML Models

Bibliographic Details
Published in: arXiv.org
Main Authors: Ghamari, Sedigh; Ozcan, Koray; Dinh, Thu; Melnikov, Andrey; Carvajal, Juan; Ernst, Jan; Chai, Sek
Format: Paper
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 10.03.2021

Summary: We propose a Quantization Guided Training (QGT) method to guide DNN training towards optimized low-bit-precision targets and reach extreme compression levels below 8-bit precision. Unlike standard quantization-aware training (QAT) approaches, QGT uses customized regularization to encourage weight values towards a distribution that maximizes accuracy while reducing quantization errors. One of the main benefits of this approach is the ability to identify compression bottlenecks. We validate QGT using state-of-the-art model architectures on vision datasets. We also demonstrate the effectiveness of QGT with an 81KB tiny model for person detection quantized down to 2-bit precision (a 17.7x size reduction), with an accuracy drop of only 3% compared to a floating-point baseline.
ISSN: 2331-8422
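
To make the regularization idea in the summary concrete, below is a minimal PyTorch-style sketch, not the paper's exact formulation: a penalty term pulls each weight toward the nearest level of a symmetric uniform b-bit quantizer, so the trained weights incur less error when quantized. The names quantization_penalty, training_step, and the coefficient lam are illustrative assumptions, not identifiers from the paper.

    import torch
    import torch.nn as nn

    def quantization_penalty(weight: torch.Tensor, bits: int = 2) -> torch.Tensor:
        # Illustrative regularizer (an assumption, not the exact QGT term):
        # mean squared distance between each weight and its nearest level of a
        # per-tensor symmetric uniform quantizer with 2**bits - 1 levels.
        num_steps = 2 ** bits - 1
        max_abs = weight.detach().abs().max().clamp(min=1e-8)
        scale = 2 * max_abs / num_steps                  # grid step size
        nearest = torch.round(weight / scale) * scale    # nearest grid point
        return ((weight - nearest) ** 2).mean()

    def training_step(model: nn.Module, inputs, targets, task_loss_fn,
                      lam: float = 1e-2, bits: int = 2) -> torch.Tensor:
        # Combine the task loss with the quantization-guidance penalty so that
        # gradient descent nudges weights toward quantization-friendly values.
        task_loss = task_loss_fn(model(inputs), targets)
        reg = sum(quantization_penalty(p, bits)
                  for name, p in model.named_parameters()
                  if name.endswith("weight"))
        return task_loss + lam * reg

Raising lam trades task loss for lower quantization error; tracking how accuracy behaves as lam and the bit width are tightened is one way a penalty of this kind can surface the compression bottlenecks the summary refers to.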