Performance Optimisation of Parallelized ADAS Applications in FPGA-GPU Heterogeneous Systems: A Case Study With Lane Detection

Bibliographic Details
Published in	IEEE transactions on intelligent vehicles Vol. 4; no. 4; pp. 519 - 531
Main Authors	Wang, Xiebing; Huang, Kai; Knoll, Alois
Format	Journal Article
Language	English
Published	Piscataway IEEE 01.12.2019 The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Accelerators; ADAS; Advanced driver assistance systems; Commercial off-the-shelf technology; Computer languages; Explosives detection; Field programmable gate arrays; FPGA; GPU; Graphics processing units; Lane detection; OpenCL; Optimization; Parallel processing; R&D; Research & development; Run time (computers)
Summary:	The explosive growth of data captured by the various sensors on modern vehicles has driven the deployment of Commercial Off-The-Shelf (COTS) accelerators in the research and development of Advanced Driver Assistance Systems (ADAS). Although cross-platform programming frameworks such as Open Computing Language (OpenCL) make ADAS applications easier to program on heterogeneous devices, performance portability remains fragile because it depends on the differing hardware implementations of each manufacturer. With this issue in mind, in this article we propose a detailed procedure that helps guide the performance optimisation of parallelized ADAS applications in an FPGA-GPU combined heterogeneous system. Taking two different lane detection applications as case studies, we provide one intra-accelerator and two inter-accelerator optimisation methods, as well as both FPGA-specific and application-oriented optimisation strategies, to boost the program runtime performance. Experimental results on a heterogeneous platform with COTS FPGA and GPU components reveal that the optimal designs generated from the procedure can improve the runtime performance of the two applications by an average of 109.21% and 83.48% over the native parallel implementations, respectively.
ISSN:	2379-8858, 2379-8904
DOI:	10.1109/TIV.2019.2938092