Flexible accelerator for sparse tensors in convolutional neural networks
Main Authors | , , |
---|---|
Format | Patent |
Language | English |
Published | 24.10.2023 |
Subjects | |
Summary: | An apparatus includes a tensor compute cluster having a plurality of tensor compute units to process a plurality of sub-feature maps in a machine learning application and a tensor memory cluster having a plurality of tensor feature map memory units to store the plurality of sub-feature maps. The apparatus also includes circuitry to partition an input feature map into the plurality of sub-feature maps such that sparsity in each of the plurality of sub-feature maps satisfies a predetermined threshold, and assign each of the plurality of sub-feature maps to one of the plurality of tensor compute units and one of the plurality of tensor feature map memory units for processing in parallel. |
Bibliography: | Application Number: US202016830129 |
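
The partition-and-assign idea in the summary can be illustrated with a small software sketch. This is not the patented circuitry; it is a minimal illustration under stated assumptions. The record does not say how the threshold is applied, so the sketch assumes "satisfies a predetermined threshold" means that the fraction of zeros in each sub-feature map is at most `max_sparsity`. The function names, the list of candidate tile sizes, and the round-robin assignment are all hypothetical.

```python
def sparsity(tile):
    """Fraction of zero entries in a 2-D tile (a list of rows)."""
    total = sum(len(row) for row in tile)
    zeros = sum(1 for row in tile for v in row if v == 0)
    return zeros / total


def split(fmap, th, tw):
    """Cut a 2-D feature map into th x tw tiles (edge tiles may be smaller)."""
    return [
        [row[c:c + tw] for row in fmap[r:r + th]]
        for r in range(0, len(fmap), th)
        for c in range(0, len(fmap[0]), tw)
    ]


def partition(fmap, candidate_sizes, max_sparsity):
    """Return the first tiling in which every tile meets the assumed threshold.

    Assumption: a tile "satisfies" the threshold when its sparsity is at most
    max_sparsity. Returns None if no candidate tile size works.
    """
    for th, tw in candidate_sizes:
        tiles = split(fmap, th, tw)
        if all(sparsity(t) <= max_sparsity for t in tiles):
            return tiles
    return None


def assign(tiles, num_units):
    """Hypothetical round-robin: tile i goes to compute unit and memory unit i % num_units."""
    return [(i % num_units, i % num_units, t) for i, t in enumerate(tiles)]


# Toy 4x4 input feature map; zeros represent sparse (inactive) entries.
fmap = [
    [1, 0, 0, 2],
    [0, 3, 4, 0],
    [7, 0, 5, 0],
    [0, 8, 0, 6],
]
tiles = partition(fmap, candidate_sizes=[(1, 1), (2, 2), (4, 4)], max_sparsity=0.5)
plan = assign(tiles, num_units=2)  # (compute_unit, memory_unit, sub_feature_map)
```

In this toy input the 1x1 tiling is rejected because some tiles are entirely zero, while the 2x2 tiling is accepted because each of its four tiles is half zeros. The tiles are then spread over two hypothetical compute/memory unit pairs that could work in parallel.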