QUANTIZATION-AWARE TRAINING WITH NUMERICAL OVERFLOW AVOIDANCE FOR NEURAL NETWORKS

Bibliographic Details
Main Authors: Colbert, Ian Charles; Raghavendra, Prakash Sathyanath; Ramachandran, Arun Coimbatore; Ramasamy, Chandra Kumar; Saeedi, Mehdi; Sines, Gabor; Pappalardo, Alessandro
Format: Patent
Language: English
Published: 13.06.2024

Summary: An apparatus and method for efficiently creating less computationally intensive nodes for a neural network. In various implementations, a computing system includes a memory that stores multiple input data values for training a neural network, and a processor. Rather than determine a bit width P of an integer accumulator of a node of the neural network based on the bit widths of the input data values and corresponding weight values, the processor selects the bit width P during training. The processor adjusts the magnitudes of the weight values during iterative stages of training the node such that an L1 norm value of the node's weight values does not exceed a corresponding weight magnitude limit.
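
The overflow-avoidance idea in the abstract can be made concrete with a small sketch. Assuming signed N-bit inputs and a signed P-bit accumulator, the worst-case dot-product magnitude is bounded by the L1 norm of the integer weights times 2^(N-1), so keeping that norm below (2^(P-1) - 1) / 2^(N-1) guarantees the accumulator cannot overflow. The Python (PyTorch) snippet below illustrates this as a per-channel rescaling that could be applied after each training update; the exact bound and the helper names (accumulator_l1_limit, constrain_weight_l1) are illustrative assumptions, not the patent's claimed method:

    import torch

    def accumulator_l1_limit(input_bits: int, acc_bits: int) -> float:
        # Largest L1 norm of integer weights that a signed acc_bits-bit
        # accumulator can tolerate for signed input_bits-bit inputs.
        # Worst-case input magnitude is 2**(input_bits - 1), so the dot
        # product is bounded by ||w||_1 * 2**(input_bits - 1); keeping
        # that below 2**(acc_bits - 1) - 1 rules out overflow.
        return (2 ** (acc_bits - 1) - 1) / (2 ** (input_bits - 1))

    def constrain_weight_l1(weight: torch.Tensor, limit: float) -> torch.Tensor:
        # Rescale each output channel (row) whose L1 norm exceeds the
        # limit; rows already under the limit are left unchanged.
        norms = weight.abs().sum(dim=1, keepdim=True)
        scale = (limit / norms).clamp(max=1.0)
        return weight * scale

    # Example: 8-bit inputs with a 16-bit accumulator permit
    # ||w||_1 <= (2**15 - 1) / 2**7, i.e. roughly 255.99.
    limit = accumulator_l1_limit(input_bits=8, acc_bits=16)
    w = torch.randn(64, 512) * 10.0   # stand-in for a layer's integer weights
    w = constrain_weight_l1(w, limit)
    assert torch.all(w.abs().sum(dim=1) <= limit + 1e-4)

During quantization-aware training, a constraint of this kind would typically be enforced on the quantized integer representation of the weights after each iterative update, so that the accumulator bit width P selected during training remains safe throughout.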
Bibliography: Application Number: US202218065393