QUANTIZATION-AWARE TRAINING WITH NUMERICAL OVERFLOW AVOIDANCE FOR NEURAL NETWORKS
Main Authors | |
---|---|
Format | Patent |
Language | English |
Published | 13.06.2024 |
Subjects | |
Summary: | An apparatus and method for efficiently creating less computationally intensive nodes for a neural network. In various implementations, a computing system includes a memory that stores multiple input data values for training a neural network, and a processor. Rather than determining the bit width P of a node's integer accumulator from the bit widths of the input data values and the corresponding weight values, the processor selects the bit width P during training. The processor adjusts the magnitudes of the weight values during iterative stages of training the node so that the L1 norm of the node's weight values does not exceed a corresponding weight magnitude limit. |
---|---|
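The filing does not reproduce code, but the overflow-avoidance constraint the summary describes can be sketched directly: because |Σᵢ wᵢxᵢ| ≤ ‖w‖₁ · max|x|, capping the L1 norm of a node's weights at (2^(P−1) − 1) / 2^(B−1) guarantees that a signed P-bit accumulator summing signed B-bit inputs can never overflow, including at every partial sum. The Python below is a minimal illustration of that bound and of rescaling weights back under the limit after a training step; the function names, the rescale-by-ratio step, and the 16-bit/8-bit example are assumptions for illustration, not details from the patent.

```python
import numpy as np

def l1_weight_limit(acc_bits: int, input_bits: int) -> float:
    """Largest L1 norm the weights may have so that a dot product of
    signed input_bits-wide values never overflows a signed acc_bits-wide
    accumulator, using |sum(w*x)| <= ||w||_1 * max|x|.
    (acc_bits and input_bits are illustrative parameters, not from the filing.)
    """
    max_acc = 2 ** (acc_bits - 1) - 1   # largest value a P-bit signed accumulator holds
    max_in = 2 ** (input_bits - 1)      # magnitude bound on a signed B-bit input
    return max_acc / max_in

def clip_weights_to_l1(weights: np.ndarray, limit: float) -> np.ndarray:
    """After a training step, rescale a node's weights so that their
    L1 norm stays at or below the weight magnitude limit."""
    l1 = np.abs(weights).sum()
    if l1 > limit:
        weights = weights * (limit / l1)
    return weights

# Example: 8-bit inputs accumulated in a 16-bit register.
limit = l1_weight_limit(acc_bits=16, input_bits=8)   # (2**15 - 1) / 2**7 ~= 255.99
w = np.random.randn(64).astype(np.float32)
w = clip_weights_to_l1(w, limit)
assert np.abs(w).sum() <= limit + 1e-6
```

Rescaling the whole weight vector by a single ratio is just one way to enforce the constraint; per-weight clipping or a penalty term during the gradient step would satisfy the same L1 bound.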
Bibliography: | Application Number: US202218065393 |