A Survey and Taxonomy of FPGA-based Deep Learning Accelerators

Bibliographic Details
Published in: Journal of Systems Architecture, Vol. 98, pp. 331-345
Main Authors: Blaiech, Ahmed Ghazi; Ben Khalifa, Khaled; Valderrama, Carlos; Fernandes, Marcelo A.C.; Bedoui, Mohamed Hedi
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.09.2019

Highlights:
•This paper is original in that it serves as a guide for researchers in the area of FPGA-based deep learning accelerators.
•This paper analyzes the characteristics of existing architectures in order to propose better development strategies.
•The literature surveyed in this paper is very recent (many references are from 2018).

Summary: Deep learning, the fastest growing segment of Artificial Neural Networks (ANNs), has led to the emergence of many machine learning applications and their implementation across multiple platforms such as CPUs, GPUs and reconfigurable hardware (Field-Programmable Gate Arrays, or FPGAs). However, inspired by the structure and function of ANNs, large-scale deep learning topologies require a considerable amount of parallel processing, memory resources, high throughput and significant processing power. Consequently, in the context of real-time hardware systems, it is crucial to find the right trade-off between performance, energy efficiency, development speed, and cost. Although limited in size and resources, FPGAs have been shown by several approaches to provide a good starting point for the development of future deep learning implementation architectures. In this paper, we briefly review recent work related to the implementation of deep learning algorithms on FPGAs. We analyze and compare the design requirements and features of existing topologies, and then propose development strategies and implementation architectures for better use of FPGA-based deep learning topologies. In this context, we examine the frameworks used in these studies, which allow many topologies to be tested in order to arrive at the best implementation alternatives in terms of performance and energy efficiency.
ISSN: 1383-7621; 1873-6165
DOI: 10.1016/j.sysarc.2019.01.007