NEURAL NETWORK COMPRESSION

Bibliographic Details
Main Authors: Park, Mi Sun; Brick, Cormac M; Xu, Xiaofan
Format: Patent
Language: English
Published: 03.03.2022

Summary: A neural network model is trained, where the training includes multiple training iterations. Weights of a particular layer of the neural network are pruned during a forward pass of a particular one of the training iterations. During the same forward pass of the particular training iteration, values of weights of the particular layer are quantized to determine a quantized-sparsified subset of weights for the particular layer. A compressed version of the neural network model is generated from the training based at least in part on the quantized-sparsified subset of weights.
Bibliography: Application Number: US201917416461
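
Note: The summary describes pruning and quantizing a layer's weights within the same forward pass of a training iteration, then building a compressed model from the resulting quantized-sparsified weights. The sketch below is a minimal, generic illustration of that idea (magnitude pruning plus uniform symmetric quantization with a straight-through estimator), not the patented method; the class name SparseQuantLinear and the sparsity/num_bits parameters are illustrative assumptions.

    # Minimal sketch (assumed names and hyperparameters, not the patented method):
    # prune and quantize a layer's weights during each forward pass of training.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SparseQuantLinear(nn.Linear):
        """Linear layer whose weights are pruned and quantized on every forward pass."""

        def __init__(self, in_features, out_features, sparsity=0.5, num_bits=8):
            super().__init__(in_features, out_features)
            self.sparsity = sparsity   # fraction of weights zeroed by magnitude pruning
            self.num_bits = num_bits   # bit width of the uniform quantizer

        def forward(self, x):
            w = self.weight
            # Prune: zero the smallest-magnitude weights of this layer.
            k = int(self.sparsity * w.numel())
            if k > 0:
                threshold = w.abs().flatten().kthvalue(k).values
                mask = (w.abs() > threshold).float()
            else:
                mask = torch.ones_like(w)
            # Quantize: uniform symmetric quantization to num_bits levels.
            qmax = 2 ** (self.num_bits - 1) - 1
            scale = w.abs().max().clamp(min=1e-8) / qmax
            w_q = torch.round(w / scale).clamp(-qmax, qmax) * scale
            # Straight-through estimator: the forward pass uses the
            # quantized-sparsified weights, while gradients flow to the
            # dense full-precision weights.
            w_qs = w + ((w_q * mask) - w).detach()
            return F.linear(x, w_qs, self.bias)

After training, one way a compressed model could then be produced is by storing, for each such layer, the pruning mask, the quantization scale, and the integer weight values rather than the dense full-precision weights; this follow-up step is likewise a generic assumption, not a description of the patent's procedure.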