Low Precision RNNs: Quantizing RNNs Without Losing Accuracy
Main Authors | , , |
---|---|
Format | Journal Article |
Language | English |
Published | 20.10.2017 |
Subjects | |
Summary: | Similar to convolutional neural networks, recurrent neural networks (RNNs) typically suffer from over-parameterization. Reducing the bit-widths of weights and activations yields runtime efficiency on hardware, yet it often comes at the cost of reduced accuracy. This paper proposes a quantization approach that increases model size (parameter count) as bit-width is reduced, allowing networks to match their baseline accuracy while still keeping the benefits of reduced precision and a net reduction in overall model size. |
---|---|
DOI: | 10.48550/arxiv.1710.07706 |
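The summary above describes trading wider layers for lower bit-widths so that overall model size still shrinks. As a rough illustration, the following sketch shows how that arithmetic can work out; the layer sizes and the uniform quantizer are assumptions for illustration, since the record does not specify the paper's exact scheme.

```python
import numpy as np

def quantize_uniform(w: np.ndarray, bits: int) -> np.ndarray:
    """Uniformly quantize values in [-1, 1] onto 2**bits levels.

    A generic k-bit uniform quantizer used purely for illustration;
    the record above does not give the paper's exact quantization scheme.
    """
    levels = 2 ** bits - 1
    w = np.clip(w, -1.0, 1.0)
    # Map [-1, 1] -> [0, levels], snap to the integer grid, map back.
    return 2.0 * np.round((w + 1.0) / 2.0 * levels) / levels - 1.0

# Hypothetical RNN layer sizes: widen the hidden state when lowering
# precision, then compare total storage for the recurrent weight matrix.
hidden_fp32, bits_fp32 = 512, 32    # assumed full-precision baseline
hidden_low, bits_low = 1024, 2      # assumed wider, 2-bit variant

params_fp32 = hidden_fp32 * hidden_fp32   # entries in the weight matrix
params_low = hidden_low * hidden_low

print(f"baseline:   {params_fp32 * bits_fp32 / 8 / 1024:.0f} KiB")
print(f"wide 2-bit: {params_low * bits_low / 8 / 1024:.0f} KiB")
# Even with 4x the parameters, 2-bit storage is 1/4 the baseline size.

w = np.random.default_rng(0).standard_normal((4, 4)) * 0.5
print(quantize_uniform(w, bits=2))  # weights snapped onto 4 discrete levels
```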