DRL-based Multi-Stream Scheduling of Inference Pipelines on Edge Devices
Published in | 2024 37th International Conference on VLSI Design and 2024 23rd International Conference on Embedded Systems (VLSID), pp. 324-329 |
---|---|
Main Authors | , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 06.01.2024 |
Summary: | Real-time scheduling of multiple neural-network inference pipelines on Graphics Processing Unit (GPU) based edge devices is an active research area. Applications such as Advanced Driver-Assistance Systems (ADAS) execute several such inference pipelines to make informed decisions on driving scenarios. The real-time performance of ADAS is often constrained by platform resource limitations, incurring execution latency that ultimately leads to deadline violations. Modern GPUs support concurrent execution of multiple compute streams; however, the literature lacks scheduling strategies that exploit multiple such streams for concurrent execution of inference pipelines. In this paper, we address this issue by proposing a Deep Reinforcement Learning (DRL) based solution for multi-stream scheduling of inference pipelines on edge GPUs. Using DRL, we learn to map every layer of the target inference pipelines to a high- or low-priority stream while satisfying task-level deadline requirements. The experimental evaluation shows the efficacy of the proposed approach compared with several baseline approaches. |
---|---|
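The summary describes mapping each layer of an inference pipeline to a high- or low-priority GPU stream subject to a deadline. As a rough illustration only (not the authors' DRL method), a greedy baseline might promote the heaviest layers to the high-priority stream when the serial latency estimate exceeds the deadline. All names, the profiled latencies, and the contention model below are hypothetical:

```python
# Hypothetical sketch of layer-to-stream assignment: a DRL agent in the
# paper would learn this mapping; here a simple greedy heuristic stands in.
from dataclasses import dataclass

HIGH, LOW = 0, 1  # illustrative stream-priority labels


@dataclass
class Layer:
    name: str
    est_latency_ms: float  # profiled per-layer latency (assumed input)


def assign_streams(layers, deadline_ms):
    """Greedy baseline: if the serial latency estimate overruns the
    deadline, promote the heaviest layers to the high-priority stream,
    assuming (hypothetically) that priority roughly halves the
    contention-induced delay of a promoted layer."""
    total = sum(l.est_latency_ms for l in layers)
    overrun = total - deadline_ms
    mapping = {l.name: LOW for l in layers}
    for layer in sorted(layers, key=lambda l: l.est_latency_ms, reverse=True):
        if overrun <= 0:
            break
        mapping[layer.name] = HIGH
        overrun -= layer.est_latency_ms * 0.5  # assumed contention saving
    return mapping
```

For example, a three-layer pipeline with latencies 4 ms, 8 ms, and 2 ms against a 12 ms deadline overruns by 2 ms, so only the heaviest layer gets promoted. A learned policy would replace this hand-tuned rule with per-layer decisions trained against observed deadline misses.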
ISSN: | 2380-6923 |
DOI: | 10.1109/VLSID60093.2024.00060 |