A Survey and Taxonomy of FPGA-based Deep Learning Accelerators

Bibliographic Details
Published in: Journal of Systems Architecture, Vol. 98, pp. 331-345
Main Authors: Blaiech, Ahmed Ghazi; Ben Khalifa, Khaled; Valderrama, Carlos; Fernandes, Marcelo A.C.; Bedoui, Mohamed Hedi
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.09.2019

Highlights:
•This paper is original in that it serves as a guide for researchers in the area of FPGA-based deep learning accelerators.
•This paper analyzes the characteristics of existing architectures in order to propose better development strategies.
•The literature surveyed in this paper is very recent (many references are from 2018).

Summary: Deep learning, the fastest growing segment of Artificial Neural Networks (ANNs), has led to the emergence of many machine learning applications and their implementation across multiple platforms such as CPUs, GPUs and reconfigurable hardware (Field-Programmable Gate Arrays, or FPGAs). However, inspired by the structure and function of ANNs, large-scale deep learning topologies require a considerable amount of parallel processing, memory resources, high throughput and significant processing power. Consequently, in the context of real-time hardware systems, it is crucial to find the right trade-off between performance, energy efficiency, development speed, and cost. Although limited in size and resources, FPGAs have been shown by several approaches to provide a good starting point for the development of future deep learning implementation architectures. In this paper, we briefly review recent work related to the implementation of deep learning algorithms on FPGAs. We analyze and compare the design requirements and features of existing topologies, and then propose development strategies and implementation architectures for better use of FPGA-based deep learning topologies. In this context, we examine the frameworks used in these studies, which allow many topologies to be tested in order to arrive at the best implementation alternatives in terms of performance and energy efficiency.
ISSN: 1383-7621; 1873-6165
DOI: 10.1016/j.sysarc.2019.01.007