A Survey and Taxonomy of FPGA-based Deep Learning Accelerators
Published in | Journal of systems architecture Vol. 98; pp. 331 - 345 |
---|---|
Main Authors | , , , , |
Format | Journal Article |
Language | English |
Published | Elsevier B.V., 01.09.2019 |
Subjects | |
Summary: | • This paper is original in that it serves as a guide for researchers in the area of FPGA-based deep learning accelerators. • This paper analyzes the characteristics of existing architectures in order to propose better development strategies. • The literature used in this paper is very recent (many references are from 2018).
Deep learning, the fastest growing segment of Artificial Neural Networks (ANNs), has led to the emergence of many machine learning applications and their implementation across multiple platforms such as CPUs, GPUs and reconfigurable hardware (Field-Programmable Gate Arrays, or FPGAs). However, inspired by the structure and function of ANNs, large-scale deep learning topologies require a considerable amount of parallel processing, memory resources, high throughput and significant processing power. Consequently, in the context of real-time hardware systems, it is crucial to find the right trade-off between performance, energy efficiency, fast development, and cost. Although FPGAs are limited in size and resources, several approaches have shown that they provide a good starting point for the development of future deep learning implementation architectures. In this paper, we briefly review recent work related to the implementation of deep learning algorithms on FPGAs. We analyze and compare the design requirements and features of existing topologies in order to propose development strategies and implementation architectures that make better use of FPGA-based deep learning topologies. In this context, we examine the frameworks used in these studies, which allow many topologies to be tested in order to arrive at the best implementation alternatives in terms of performance and energy efficiency. |
---|---|
ISSN: | 1383-7621 1873-6165 |
DOI: | 10.1016/j.sysarc.2019.01.007 |