FPGA-based Acceleration of Binary Neural Network Training with Minimized Off-Chip Memory Access

Bibliographic Details
Published in: 2019 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED), pp. 1-6
Main Authors: Chundi, Pavan Kumar; Liu, Peiye; Park, Sangsu; Lee, Seho; Seok, Mingoo
Format: Conference Proceeding
Language: English
Published: IEEE, 01.07.2019

Summary: In this paper, we examine the feasibility of the FPGA as a platform for training a convolutional binary-weight neural network. Training a neural network requires more data movement than inference, so accelerating training on an FPGA is a challenge: the additional data movement increases off-chip memory accesses. We address this problem by storing most of the data in on-chip memory and adopting batch renormalization, which allows a large network to be trained with less intermediate data and less data movement. For the case where all data except the input images can be stored on the FPGA chip, we present an accelerator for training CNNs to classify the CIFAR-10 dataset. Further, we study the impact of network size on the performance and energy of the FPGA and GPU. Our accelerator, mapped onto an Arria 10 FPGA chip, achieves up to 9.33X higher energy efficiency than the NVIDIA GeForce GTX 1080 Ti GPU at similar performance.
DOI: 10.1109/ISLPED.2019.8824805
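
The summary names batch renormalization as the mechanism that keeps intermediate training data small enough to hold in on-chip memory, but it does not spell out the computation. Below is a minimal NumPy sketch of a training-time batch renormalization forward pass following the original formulation (Ioffe, 2017); the function name, parameter names, and the clip limits r_max and d_max are illustrative assumptions, not the authors' FPGA implementation.

```python
import numpy as np

def batch_renorm_forward(x, gamma, beta, running_mean, running_var,
                         momentum=0.99, r_max=3.0, d_max=5.0, eps=1e-5):
    """One training-time forward pass of batch renormalization (Ioffe, 2017).

    Hypothetical sketch: unlike plain batch norm, the output is corrected
    toward the running statistics via r and d, so training remains stable
    even with the small minibatches whose intermediate data fit on-chip.
    x is (batch, features); gamma/beta are learned scale and shift.
    """
    mu = x.mean(axis=0)                       # minibatch mean
    var = x.var(axis=0)                       # minibatch variance
    sigma = np.sqrt(var + eps)                # minibatch std
    sigma_run = np.sqrt(running_var + eps)    # running std

    # Correction factors; treated as constants in backprop (no gradient).
    r = np.clip(sigma / sigma_run, 1.0 / r_max, r_max)
    d = np.clip((mu - running_mean) / sigma_run, -d_max, d_max)

    x_hat = (x - mu) / sigma * r + d          # renormalized activations
    y = gamma * x_hat + beta                  # learned scale and shift

    # Update running statistics for inference.
    running_mean = momentum * running_mean + (1 - momentum) * mu
    running_var = momentum * running_var + (1 - momentum) * var
    return y, running_mean, running_var
```

The design point relevant to the paper is that the r and d corrections let training proceed with small minibatches, so the per-batch activations and statistics that would otherwise spill to off-chip DRAM can stay in on-chip memory.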