Robustness-aware 2-bit quantization with real-time performance for neural network

Bibliographic Details
Published in: Neurocomputing (Amsterdam), Vol. 455, pp. 12–22
Main Authors: Li, Xiaobin; Jiang, Hongxu; Zhang, Runhua; Tian, Fangzheng; Huang, Shuangxi; Xu, Donghuan
Format: Journal Article
Language: English
Published: Elsevier B.V., 30.09.2021

Summary: Quantized neural networks (NNs) with reduced bit precision are practical solutions for minimizing computational and memory resource requirements and play a vital role in machine learning. However, avoiding significant accuracy degradation caused by numerical approximation and reduced redundancy remains challenging. In this paper, a novel robustness-aware 2-bit quantization scheme (RAQ) is proposed for NNs, based on binary NNs and generative adversarial networks (GANs), which improves performance by enriching the information of binary NNs, extracting structural information, and accounting for the robustness of the quantized NN. Specifically, replacing the multiply-accumulate operation with a shift-add operation in the quantization process speeds up the NN. A structural loss is proposed to represent the difference between the original and quantized NNs, so that the structural information of the data is preserved after quantization. The structural information learned from the NN plays an important role in improving performance and allows further fine-tuning of the quantized NN by applying a Lipschitz constraint to the structural loss. For the first time, we consider the robustness of the quantized NN and propose a non-sensitive perturbation loss function by introducing an extraneous term based on the spectral norm. Experiments were conducted on the CIFAR-10, SVHN, and ImageNet datasets with popular NNs (such as MobileNetV2 and ResNet20). Extensive experiments show that our 2-bit quantization scheme is more efficient than state-of-the-art quantization methods: it effectively reduces latency by 2× and the accuracy decline by 1–4%. The experimental results also demonstrate that RAQ is robust to adversarial attacks; we not only eliminate the robustness gap between full-precision and quantized models, but also improve robustness over full-precision models by 10%.
ISSN: 0925-2312, 1872-8286
DOI: 10.1016/j.neucom.2021.05.006
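
The record carries no code, but two of the techniques the abstract names lend themselves to a short illustration: 2-bit power-of-two weight levels, which let a multiply-accumulate become a shift-add, and a spectral-norm penalty that bounds a layer's Lipschitz constant. The sketch below is a minimal, hypothetical reading of those ideas in PyTorch; the function names, the {-2s, -s, +s, +2s} codebook, and the hyperparameters are assumptions, not the authors' RAQ implementation.

    import torch
    import torch.nn.functional as F

    def quantize_2bit_pow2(w: torch.Tensor) -> torch.Tensor:
        # Map each weight onto one of four signed power-of-two levels
        # {-2s, -s, +s, +2s} sharing a per-tensor scale s (an assumption;
        # the paper's exact codebook may differ). Because every level is
        # a power of two times s, multiplying by a quantized weight can be
        # realized as a bit shift plus sign handling -- the shift-add
        # replacement for multiply-accumulate that the abstract describes.
        s = w.abs().mean()                             # shared scale (heuristic)
        mag = torch.where(w.abs() > 1.5 * s, 2.0 * s, s)
        return torch.sign(w) * mag

    def spectral_norm_estimate(weight: torch.Tensor, iters: int = 5) -> torch.Tensor:
        # Estimate the largest singular value of a (flattened) weight
        # matrix by power iteration. Penalizing this value in the training
        # loss bounds the layer's Lipschitz constant -- the flavor of
        # "non-sensitive perturbation loss" a spectral-norm term provides.
        w = weight.reshape(weight.shape[0], -1)
        u = torch.randn(w.shape[0], device=w.device)
        v = F.normalize(w.t() @ u, dim=0)
        for _ in range(iters):
            u = F.normalize(w @ v, dim=0)
            v = F.normalize(w.t() @ u, dim=0)
        return u @ w @ v                               # sigma_max estimate

    def raq_style_loss(task_loss, fp_feats, q_feats, weights,
                       lam_struct=0.1, lam_spec=1e-3):
        # Hypothetical combined objective: the task loss, a structural
        # loss pulling quantized features toward the full-precision ones
        # (MSE used as a stand-in), and the spectral-norm penalty.
        struct = F.mse_loss(q_feats, fp_feats)
        spec = sum(spectral_norm_estimate(w) for w in weights)
        return task_loss + lam_struct * struct + lam_spec * spec

In an actual deployment, the shift-add speedup would come from integer kernels that exploit the power-of-two levels at inference time; the floating-point emulation above only mirrors the arithmetic.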