NEURAL NETWORK COMPRESSION
Format | Patent
---|---
Language | English
Published | 03.03.2022
Summary: A neural network model is trained, where the training includes multiple training iterations. Weights of a particular layer of the neural network are pruned during a forward pass of a particular one of the training iterations. During the same forward pass of the particular training iteration, values of weights of the particular layer are quantized to determine a quantized-sparsified subset of weights for the particular layer. A compressed version of the neural network model is generated from the training based at least in part on the quantized-sparsified subset of weights.
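The abstract describes pruning and quantizing a layer's weights within the same forward pass, yielding a quantized-sparsified subset from which the compressed model is built. A minimal sketch of that idea follows; the patent does not name its pruning criterion or quantization scheme, so magnitude-based pruning and symmetric uniform quantization are assumed here purely for illustration, and the function names are hypothetical.

```python
import numpy as np

def prune_and_quantize(weights, sparsity=0.5, num_bits=8):
    """Illustrative only: magnitude-prune a fraction of the weights, then
    uniformly quantize the survivors, all in one step (as would happen
    inside a single forward pass)."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    # Threshold at the k-th smallest magnitude; smaller weights are pruned.
    threshold = np.partition(flat, k)[k] if k > 0 else 0.0
    mask = np.abs(weights) >= threshold
    pruned = weights * mask
    # Symmetric uniform "fake" quantization: snap surviving weights to an
    # integer grid, then rescale back to floats.
    max_abs = float(np.max(np.abs(pruned)))
    scale = max_abs / (2 ** (num_bits - 1) - 1) if max_abs > 0 else 1.0
    quantized = np.round(pruned / scale) * scale
    # The quantized-sparsified subset: quantized values where the mask holds.
    return quantized * mask, mask

def forward_pass(x, weights):
    # Prune and quantize during the same forward pass, as the abstract
    # describes; the returned weights are what a compressed model retains.
    qs_weights, _ = prune_and_quantize(weights)
    return x @ qs_weights, qs_weights
```

Doing both operations inside the training loop (rather than after training) lets subsequent backward passes adapt the remaining weights to the pruning and quantization error, which is the usual motivation for in-training compression.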
Bibliography: Application Number: US201917416461