NEURAL NETWORK PROCESSOR USING COMPRESSION AND DECOMPRESSION OF ACTIVATION DATA TO REDUCE MEMORY BANDWIDTH UTILIZATION

Bibliographic Details
Main Authors: Amol Ashok AMBARDEKAR, Kent D. CEDOLA, Chad Balling MCBRIDE, George PETRE, Larry Marvin WALL, Benjamin Eliot LUNDELL, Joseph Leon CORKERY, Boris BOBROV
Format: Patent
Language: English
Published: 21.03.2024
Summary: A deep neural network ("DNN") module can compress and decompress neuron-generated activation data to reduce memory bus bandwidth utilization. A compression unit (200) can receive an uncompressed chunk of data (202) generated by a neuron in the DNN module. The compression unit generates a mask portion (208) and a data portion (210) of a compressed output chunk. The mask portion encodes the presence and location of the zero and non-zero bytes in the uncompressed chunk of data. The data portion stores truncated non-zero bytes from the uncompressed chunk of data. A decompression unit (500) can receive a compressed chunk of data (204) from memory in the DNN processor or from memory of an application host. The decompression unit decompresses the compressed chunk of data using the mask portion (208) and the data portion (210). Compressing activation data in this way can reduce memory bus utilization, allow a DNN module to complete processing operations more quickly, and reduce power consumption. (Figure 4)
Bibliography: Application Number: MY2019PI06051
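
The mask-and-data scheme described in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `compress_chunk` and `decompress_chunk`, the one-mask-bit-per-byte layout, and the `drop_bits` truncation parameter are all assumptions made for illustration, since the summary does not say how the mask is laid out or how non-zero bytes are truncated.

```python
from typing import List, Tuple


def compress_chunk(chunk: bytes, drop_bits: int = 0) -> Tuple[int, List[int]]:
    """Split an uncompressed chunk into a mask portion and a data portion.

    Assumed layout (hypothetical): bit i of the mask is 1 when chunk[i] is
    non-zero and 0 when it is zero. The data portion holds the non-zero bytes
    in order, each truncated by dropping its `drop_bits` low-order bits.
    """
    if not 0 <= drop_bits < 8:
        raise ValueError("drop_bits must be in the range 0..7")
    mask = 0
    data: List[int] = []
    for i, value in enumerate(chunk):
        if value != 0:
            mask |= 1 << i                  # record the location of a non-zero byte
            data.append(value >> drop_bits)  # store the truncated value
    return mask, data


def decompress_chunk(mask: int, data: List[int], length: int,
                     drop_bits: int = 0) -> bytes:
    """Rebuild a chunk of `length` bytes from the mask and data portions.

    Zero bytes are restored exactly. Non-zero bytes are restored without the
    low-order bits that were dropped, so the result is lossy when drop_bits > 0.
    """
    out = bytearray(length)                 # starts as all zero bytes
    values = iter(data)
    for i in range(length):
        if (mask >> i) & 1:
            out[i] = next(values) << drop_bits
    return bytes(out)


if __name__ == "__main__":
    chunk = bytes([0, 7, 0, 0, 200, 0, 13, 0])
    mask, data = compress_chunk(chunk, drop_bits=2)
    print(f"mask={mask:08b} data={data}")
    print(list(decompress_chunk(mask, data, len(chunk), drop_bits=2)))
```

With `drop_bits=0` the round trip is lossless and the saving comes entirely from not storing zero bytes, which is why the approach suits sparse activation data. In this sketch a small non-zero byte can truncate to zero while its mask bit stays set; a real design would need a rule for that case, and the summary does not describe one.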