Approximating vision transformers for edge: variational inference and mixed-precision for multi-modal data
Published in | Computing Vol. 107; no. 3 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | Vienna: Springer Vienna, 01.03.2025 |
Subjects | |
ISSN | 0010-485X 1436-5057 |
DOI | 10.1007/s00607-025-01427-w |
Summary: | Vision transformer (ViT) models have shown higher accuracy, robustness, and large-volume data processing ability, creating new baselines and references for perception tasks. However, these advantages require large memory and high-performance processors and computing units, which makes model adaptation and deployment challenging in resource-constrained environments such as memory-restricted and battery-powered edge devices. This paper addresses these deployment challenges by proposing VI-ViT, a model approximation approach for edge deployment that uses variational inference with mixed precision to process multiple modalities, such as point clouds and images. Our experimental evaluation on the nuScenes and Waymo datasets shows reductions of up to 37% and 31% in model parameters and FLOPs while maintaining a mean average precision of 70.5, compared to 74.8 for the baseline model. This work presents a practical deployment approach for approximating and optimizing vision transformers for edge AI applications by balancing model metrics such as parameters, FLOPs, latency, energy consumption, and accuracy; the approach can easily be adapted to other transformer models and datasets. |
---|---|