Compiler for optimizing memory allocations within cores
Main Authors | , |
---|---|
Format | Patent |
Language | English |
Published | 09.01.2024 |
Summary: | Some embodiments provide a compiler for optimizing the implementation of a machine-trained network (e.g., a neural network) on an integrated circuit (IC). The compiler of some embodiments receives a specification of a machine-trained network including multiple layers of computation nodes and generates a graph representing options for implementing the machine-trained network in the IC. In some embodiments, the graph includes nodes representing options for implementing each layer of the machine-trained network and edges between nodes of different layers indicating which implementations are compatible with one another. The graph is used, in some embodiments, to select an optimum set of cores for implementing the received machine-trained network. The compiler, in some embodiments, optimizes memory storage such that the inputs and outputs of a single layer are not stored in the same memory unit. Such an optimization, in some embodiments, avoids attempting to read from and write to the same memory unit within a core in a single clock cycle. |
---|---|
Bibliography: | Application Number: US201916525449 |
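
The summary describes two ideas that a small sketch may make more concrete: a graph whose nodes are per-layer implementation options and whose edges join compatible options of adjacent layers, and a memory placement in which a layer never reads and writes the same memory unit. The Python below is only an illustration under stated assumptions, not the patented method. The names `Option`, `select_options`, and `assign_memory_units`, the scalar cost attached to each option, the caller-supplied compatibility rule, and the round-robin placement are all hypothetical and do not come from the record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """One hypothetical way to implement a layer (assumed fields, not from the patent)."""
    layer: int    # index of the layer this option implements
    cores: int    # number of cores the option would occupy
    cost: float   # assumed scalar cost (e.g., estimated latency or power)


def select_options(layers, compatible):
    """Choose one option per layer with minimum total cost.

    layers:     non-empty list (network order) of lists of Option.
    compatible: function (prev_option, next_option) -> bool; plays the
                role of the edges between nodes of adjacent layers.
    Returns (total_cost, [Option, ...]) or None if no compatible chain exists.
    """
    # best[i][j] = (cheapest cost of a chain ending at option j of layer i,
    #               index of the chosen option in layer i - 1)
    best = [{j: (opt.cost, None) for j, opt in enumerate(layers[0])}]
    for i in range(1, len(layers)):
        row = {}
        for j, opt in enumerate(layers[i]):
            candidates = [
                (best[i - 1][k][0] + opt.cost, k)
                for k in best[i - 1]
                if compatible(layers[i - 1][k], opt)
            ]
            if candidates:
                row[j] = min(candidates)
        best.append(row)

    if not best[-1]:
        return None

    # Walk back from the cheapest option of the last layer.
    j = min(best[-1], key=lambda idx: best[-1][idx][0])
    total = best[-1][j][0]
    path = []
    for i in range(len(layers) - 1, -1, -1):
        path.append(layers[i][j])
        j = best[i][j][1]
    path.reverse()
    return total, path


def assign_memory_units(num_layers, num_units=2):
    """Return (read_unit, write_unit) per layer using round-robin placement.

    Layer i reads its input from unit i % num_units and writes its output
    to unit (i + 1) % num_units, so the two differ whenever num_units >= 2.
    """
    if num_units < 2:
        raise ValueError("need at least two memory units to separate reads and writes")
    return [(i % num_units, (i + 1) % num_units) for i in range(num_layers)]
```

As a usage illustration, `select_options` could be called with three layers of two options each and a made-up rule such as "the next layer uses the same number of cores or twice as many", and `assign_memory_units(3)` would alternate a layer's input and output between two units. The dynamic-programming search is one straightforward way to pick a cheapest compatible chain through a layered graph; the record itself does not state which selection procedure the compiler uses.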