Application Level Resource Scheduling for Deep Learning Acceleration on MPSoC
Deep Neutral Networks (DNNs) have been widely used in many applications, such as self-driving cars, natural language processing (NLP), image classification, visual object recognition, and so on. Field-programmable gate array (FPGA) based Multiprocessor System on a Chip (MPSoC) is recently considered...
Saved in:
Published in | Journal of signal processing systems Vol. 95; no. 10; pp. 1231 - 1243 |
---|---|
Main Authors | , , , , , |
Format | Journal Article |
Language | English |
Published |
New York
Springer US
01.10.2023
Springer Nature B.V |
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | Deep Neutral Networks (DNNs) have been widely used in many applications, such as self-driving cars, natural language processing (NLP), image classification, visual object recognition, and so on. Field-programmable gate array (FPGA) based Multiprocessor System on a Chip (MPSoC) is recently considered one of the popular choices for deploying DNN models. However, the limited resource capacity of MPSoC imposes a challenge for such practical implementation. Recent studies revealed the trade-off between the “resources consumed" vs. the “performance achieved". Taking a cue from these findings, we address the problem of efficient implementation of deep learning into the resource-constrained MPSoC in this paper, where each deep learning network is run with different service levels based on resource usage (where a higher service level implies higher performance with increased resource consumption). To this end, we propose a heuristic-based strategy, Application Wise Level Selector (AWLS), for selecting service levels to maximize the overall performance subject to a given resource bound. AWLS can achieve higher performance within a constrained resource budget under various simulation scenarios. Further, we verify the proposed strategy using an AMD-Xilinx Zynq UltraScale+ XCZU9EG SoC. Using a framework designed to deploy multi-DNN on multi-DPUs (Deep Learning Units), it is proved that an optimal solution is achieved from the algorithm, which obtains the highest performance (Frames Per Second) using the same resource budget. |
---|---|
Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
ISSN: | 1939-8018 1939-8115 |
DOI: | 10.1007/s11265-023-01881-9 |