QUANTIZATION-AWARE TRAINING WITH NUMERICAL OVERFLOW AVOIDANCE FOR NEURAL NETWORKS

Bibliographic Details
Main Authors: Colbert, Ian Charles; Raghavendra, Prakash Sathyanath; Ramachandran, Arun Coimbatore; Ramasamy, Chandra Kumar; Saeedi, Mehdi; Sines, Gabor; Pappalardo, Alessandro
Format: Patent
Language: English
Published: 13.06.2024

Summary: An apparatus and method for efficiently creating less computationally intensive nodes for a neural network. In various implementations, a computing system includes a memory that stores multiple input data values for training a neural network, and a processor. Rather than determine a bit width P of an integer accumulator of a node of the neural network based on the bit widths of the input data values and corresponding weight values, the processor selects the bit width P during training. The processor adjusts the magnitudes of the weight values during iterative stages of training the node such that an L1 norm value of the node's weight values does not exceed a corresponding weight magnitude limit.
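
The overflow-avoidance idea in the abstract can be made concrete with a small sketch. Assuming signed N-bit inputs and a signed P-bit accumulator, the worst-case dot-product magnitude is bounded by the L1 norm of the integer weights times 2^(N-1), so keeping that norm below (2^(P-1) - 1) / 2^(N-1) guarantees the accumulator cannot overflow. The Python (PyTorch) snippet below illustrates this as a per-channel rescaling that could be applied after each training update; the exact bound and the helper names (accumulator_l1_limit, constrain_weight_l1) are illustrative assumptions, not the patent's claimed method:

    import torch

    def accumulator_l1_limit(input_bits: int, acc_bits: int) -> float:
        # Largest L1 norm of integer weights that a signed acc_bits-bit
        # accumulator can tolerate for signed input_bits-bit inputs.
        # Worst-case input magnitude is 2**(input_bits - 1), so the dot
        # product is bounded by ||w||_1 * 2**(input_bits - 1); keeping
        # that below 2**(acc_bits - 1) - 1 rules out overflow.
        return (2 ** (acc_bits - 1) - 1) / (2 ** (input_bits - 1))

    def constrain_weight_l1(weight: torch.Tensor, limit: float) -> torch.Tensor:
        # Rescale each output channel (row) whose L1 norm exceeds the
        # limit; rows already under the limit are left unchanged.
        norms = weight.abs().sum(dim=1, keepdim=True)
        scale = (limit / norms).clamp(max=1.0)
        return weight * scale

    # Example: 8-bit inputs with a 16-bit accumulator permit
    # ||w||_1 <= (2**15 - 1) / 2**7, i.e. roughly 255.99.
    limit = accumulator_l1_limit(input_bits=8, acc_bits=16)
    w = torch.randn(64, 512) * 10.0   # stand-in for a layer's integer weights
    w = constrain_weight_l1(w, limit)
    assert torch.all(w.abs().sum(dim=1) <= limit + 1e-4)

During quantization-aware training, a constraint of this kind would typically be enforced on the quantized integer representation of the weights after each iterative update, so that the accumulator bit width P selected during training remains safe throughout.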
Bibliography: Application Number: US202218065393