MOSAIC: Heterogeneity-, Communication-, and Constraint-Aware Model Slicing and Execution for Accurate and Efficient Inference

Bibliographic Details
Published in Proceedings / International Conference on Parallel Architectures and Compilation Techniques, pp. 165-177
Main Authors Han, Myeonggyun; Hyun, Jihoon; Park, Seongbeom; Park, Jinsu; Baek, Woongki
Format Conference Proceeding
Language English
Published IEEE 01.09.2019
ISSN 2641-7936
DOI 10.1109/PACT.2019.00021

Summary: Heterogeneous embedded systems have emerged as a promising platform for accurate and efficient deep-learning inference on mobile devices. Despite extensive prior work, system-software support that executes inference workloads efficiently by judiciously considering their performance and energy heterogeneity, communication overheads, and constraints remains unexplored. To bridge this gap, we propose MOSAIC, a heterogeneity-, communication-, and constraint-aware model slicing and execution scheme for accurate and efficient inference on heterogeneous embedded systems. MOSAIC uses dynamic programming to generate an efficient model slicing and execution plan for the target inference workload. MOSAIC significantly reduces inference latency and energy, exhibits high estimation accuracy, and incurs small overheads.
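
The summary states only that the slicing and execution plan is produced through dynamic programming; it does not give the formulation. As a rough illustration of the general idea, and not the authors' algorithm, the sketch below assigns each layer of a linear model to one device so that total latency is minimal. The function name `plan_slices`, the cost tables, and the two-device setup are all hypothetical; a constraint such as "this layer cannot run on this device" is modeled by an infinite execution cost.

```python
import math


def plan_slices(exec_cost, transfer_cost):
    """Pick a device for every layer of a linear model (illustrative only).

    exec_cost[i][d]     -- assumed latency of layer i on device d
                           (math.inf if the layer may not run there)
    transfer_cost[a][b] -- assumed cost of moving an activation from
                           device a to device b (0 when a == b)
    Returns (total_cost, plan) where plan[i] is the device for layer i.
    """
    n_devices = len(exec_cost[0])
    # best[i][d]: cheapest cost of running layers 0..i with layer i on d.
    best = [list(exec_cost[0])]
    back = []  # back[i-1][d]: device chosen for layer i-1 on that path
    for i in range(1, len(exec_cost)):
        row, ptr = [], []
        for d in range(n_devices):
            prev = min(
                range(n_devices),
                key=lambda p: best[-1][p] + transfer_cost[p][d],
            )
            row.append(best[-1][prev] + transfer_cost[prev][d] + exec_cost[i][d])
            ptr.append(prev)
        best.append(row)
        back.append(ptr)

    # Walk the back-pointers from the cheapest final device.
    last = min(range(n_devices), key=lambda d: best[-1][d])
    total = best[-1][last]
    plan = [last]
    for ptr in reversed(back):
        plan.append(ptr[plan[-1]])
    plan.reverse()
    return total, plan


# Hypothetical numbers: device 0 is a CPU, device 1 is a GPU that is
# faster but cannot execute the last layer.
exec_cost = [[4, 1], [3, 1], [2, math.inf]]
transfer_cost = [[0, 2], [2, 0]]
total, plan = plan_slices(exec_cost, transfer_cost)
print(total, plan)
```

With these made-up costs, the plan keeps the first two layers on device 1 and pays one transfer to run the last layer on device 0. A real planner would also need to weigh energy and accuracy, which this sketch leaves out.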