MULTI-LAYER NEURAL NETWORK PROCESSING BY A NEURAL NETWORK ACCELERATOR USING HOST COMMUNICATED MERGED WEIGHTS AND A PACKAGE OF PER-LAYER INSTRUCTIONS

Bibliographic Details
Main Authors	NG, Aaron; TENG, Xiao; SETTLE, Sean; GHASEMI, Ehsan; ZEJDA, Jindrich; SIRASAO, Ashish; WU, Yongjun; DELAYE, Elliott
Format	Patent
Language	English, French
Published	25.04.2019
Subjects	CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; PHYSICS
Summary:	In the disclosed methods and systems for processing in a neural network system, a host computer system (402) writes (602) a plurality of weight matrices associated with a plurality of layers of a neural network to a memory (226) shared with a neural network accelerator (238). The host computer system further assembles (610) a plurality of per-layer instructions into an instruction package. Each per-layer instruction specifies processing of a respective layer of the plurality of layers of the neural network, and respective offsets of weight matrices in a shared memory. The host computer system writes (612, 614) input data and the instruction package to the shared memory. The neural network accelerator reads (702) the instruction package from the shared memory and processes (702-712) the plurality of per-layer instructions of the instruction package.
Bibliography:	Application Number: WO2018US56112
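
Illustrative sketch (not part of the patent record): the flow described in the summary can be modelled in a few lines of Python. Everything below is an assumption made for illustration: the `bytearray` standing in for the shared memory, the function names `host_prepare` and `accelerator_run`, the 12-byte instruction layout (weight offset, rows, cols), and the use of a plain matrix-vector product as the per-layer operation. The record itself does not specify any of these details.

```python
import struct

def host_prepare(mem, matrices, x):
    """Host side (hypothetical layout): merge weights, write input, write package."""
    off = 0
    instrs = []
    # Write every layer's weight matrix back to back and remember its offset.
    for m in matrices:
        rows, cols = len(m), len(m[0])
        flat = [v for row in m for v in row]
        struct.pack_into(f"<{len(flat)}f", mem, off, *flat)
        instrs.append((off, rows, cols))  # one per-layer instruction
        off += 4 * len(flat)
    # Write the input data after the merged weights.
    in_off = off
    struct.pack_into(f"<{len(x)}f", mem, in_off, *x)
    off += 4 * len(x)
    # Write the instruction package: a count, then one record per layer.
    pkg_off = off
    struct.pack_into("<I", mem, pkg_off, len(instrs))
    for i, ins in enumerate(instrs):
        struct.pack_into("<3I", mem, pkg_off + 4 + 12 * i, *ins)
    return pkg_off, in_off

def accelerator_run(mem, pkg_off, in_off):
    """Accelerator side (simulated): read the package, process each layer in order."""
    (n_layers,) = struct.unpack_from("<I", mem, pkg_off)
    x = None
    for i in range(n_layers):
        w_off, rows, cols = struct.unpack_from("<3I", mem, pkg_off + 4 + 12 * i)
        if x is None:
            # The first layer takes its input from the shared memory.
            x = list(struct.unpack_from(f"<{cols}f", mem, in_off))
        w = struct.unpack_from(f"<{rows * cols}f", mem, w_off)
        # Assumed per-layer operation: y = W x (no bias, no activation).
        x = [sum(w[r * cols + c] * x[c] for c in range(cols)) for r in range(rows)]
    return x

if __name__ == "__main__":
    shared = bytearray(256)
    pkg, inp = host_prepare(shared, [[[1, 2], [3, 4]], [[1, 1]]], [1, 1])
    print(accelerator_run(shared, pkg, inp))
```

In this sketch the host touches the shared memory once per network rather than once per layer, which is the point the summary makes about packaging per-layer instructions together with weight offsets.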