Design and Implementation of a Universal Shift Convolutional Neural Network Accelerator
Published in: IEEE Embedded Systems Letters, vol. 16, no. 1, pp. 17-20
Format: Journal Article
Language: English
Published: Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.03.2024
Summary: Currently, many applications implement convolutional neural networks (CNNs) on CPUs or GPUs, but their performance is limited by the computational complexity of these networks. Compared with CPU or GPU implementations, deploying convolutional neural accelerators on FPGAs can achieve superior performance. On the other hand, the multiplication operations of CNNs have been a constraint on FPGAs reaching that performance. In this letter, we propose a shift CNN accelerator that converts multiplication operations into shift operations. Based on the shift operation, our accelerator breaks the computational bottleneck of FPGAs. On a Virtex UltraScale+ VU9P, our accelerator saves DSP resources and reduces memory consumption while achieving 1.18 tera operations per second (TOPS), a substantial improvement over other convolutional neural accelerators.
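The core idea behind a shift accelerator is that a multiply by a weight quantized to a signed power of two reduces to a bit shift. The sketch below shows one way such quantization and a shift-only dot product could look; the function names and the NumPy-based rounding scheme are illustrative assumptions, not the letter's actual hardware design.

```python
import numpy as np

def quantize_to_power_of_two(w):
    """Round each weight to the nearest signed power of two.

    Returns (sign, exponent) so that w ~= sign * 2**exponent.
    Zero weights get sign 0 and are skipped at compute time.
    """
    sign = np.sign(w).astype(np.int32)
    mag = np.abs(w)
    # Guard log2 against zeros; the outer where zeroes those entries anyway.
    exp = np.where(mag > 0,
                   np.round(np.log2(np.where(mag > 0, mag, 1.0))),
                   0).astype(np.int32)
    return sign, exp

def shift_dot(x, sign, exp):
    """Dot product where every multiply is replaced by a bit shift.

    x holds integer activations; (sign, exp) encode the quantized weights.
    x * 2**e is x << e for e >= 0, and an arithmetic right shift for e < 0.
    """
    acc = 0
    for xi, s, e in zip(x, sign, exp):
        s, e = int(s), int(e)
        if s == 0:
            continue  # zero weight contributes nothing
        term = (xi << e) if e >= 0 else (xi >> -e)
        acc += s * term
    return acc
```

For example, weights [2.0, -4.0] quantize exactly to (sign, exp) = ([1, -1], [1, 2]), so the dot product with activations [3, 5] is (3 << 1) - (5 << 2) = -14, matching the ordinary multiply-accumulate. In hardware, this shift-and-add replaces a DSP multiplier with cheap routing and adders, which is what lets the accelerator save DSP resources.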
ISSN: 1943-0663, 1943-0671
DOI: 10.1109/LES.2022.3233796