Low Precision RNNs: Quantizing RNNs Without Losing Accuracy
Main Authors | , , |
---|---|
Format | Journal Article |
Language | English |
Published | 20.10.2017 |
Subjects | |
Summary: | Similar to convolutional neural networks, recurrent neural networks (RNNs) typically suffer from over-parameterization. Reducing the bit-widths of weights and activations yields runtime efficiency on hardware, yet it often comes at the cost of reduced accuracy. This paper proposes a quantization approach that increases model size (parameter count) as bit-width is reduced, allowing networks to match their baseline accuracy while still keeping the benefits of reduced precision and a net reduction in overall model size. |
---|---|
DOI: | 10.48550/arxiv.1710.07706 |
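The summary above describes trading wider layers for lower bit-widths so that overall model size still shrinks. As a rough illustration, the following sketch shows how that arithmetic can work out; the layer sizes and the uniform quantizer are assumptions for illustration, since the record does not specify the paper's exact scheme.

```python
import numpy as np

def quantize_uniform(w: np.ndarray, bits: int) -> np.ndarray:
    """Uniformly quantize values in [-1, 1] onto 2**bits levels.

    A generic k-bit uniform quantizer used purely for illustration;
    the record above does not give the paper's exact quantization scheme.
    """
    levels = 2 ** bits - 1
    w = np.clip(w, -1.0, 1.0)
    # Map [-1, 1] -> [0, levels], snap to the integer grid, map back.
    return 2.0 * np.round((w + 1.0) / 2.0 * levels) / levels - 1.0

# Hypothetical RNN layer sizes: widen the hidden state when lowering
# precision, then compare total storage for the recurrent weight matrix.
hidden_fp32, bits_fp32 = 512, 32    # assumed full-precision baseline
hidden_low, bits_low = 1024, 2      # assumed wider, 2-bit variant

params_fp32 = hidden_fp32 * hidden_fp32   # entries in the weight matrix
params_low = hidden_low * hidden_low

print(f"baseline:   {params_fp32 * bits_fp32 / 8 / 1024:.0f} KiB")
print(f"wide 2-bit: {params_low * bits_low / 8 / 1024:.0f} KiB")
# Even with 4x the parameters, 2-bit storage is 1/4 the baseline size.

w = np.random.default_rng(0).standard_normal((4, 4)) * 0.5
print(quantize_uniform(w, bits=2))  # weights snapped onto 4 discrete levels
```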