Dynamically throttleable neural networks
Published in: Machine Vision and Applications, Vol. 33, No. 4
Format: Journal Article
Language: English
Published: Berlin/Heidelberg: Springer Berlin Heidelberg, 01.07.2022 (Springer Nature B.V.)
Summary: Conditional computation for deep neural networks reduces overall computational load and improves model accuracy by running a subset of the network. In this work, we present a runtime dynamically throttleable neural network (DTNN) that can self-regulate its own performance target and computing resources by dynamically activating neurons in response to a single control signal, called utilization. We describe a generic formulation of throttleable neural networks (TNNs) by grouping and gating partial neural modules with various gating strategies. To directly optimize arbitrary application-level performance metrics and model complexity, a controller network is trained separately to predict a context-aware utilization via deep contextual bandits. Extensive experiments and comparisons on image classification and object detection tasks show that TNNs can be effectively throttled across a wide range of utilization settings, while having peak accuracy and lower cost that are comparable to corresponding vanilla architectures such as VGG, ResNet, ResNeXt, and DenseNet. We further demonstrate the effectiveness of the controller network on throttleable 3D convolutional networks (C3D) for video-based hand gesture recognition, which outperforms the vanilla C3D and all fixed utilization settings.
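The abstract's core idea, gating groups of neural units with a single scalar utilization signal, can be illustrated with a minimal NumPy sketch. This is an assumption-laden toy, not the paper's implementation: the function name `throttleable_dense`, the group count, and the "keep the first k groups" gating strategy are all hypothetical stand-ins for the various gating strategies the paper formalizes.

```python
import numpy as np

def throttleable_dense(x, W, b, utilization, num_groups=4):
    """Toy widthwise-gated dense layer (illustrative sketch only).

    Output units are split into `num_groups` equal groups; only
    ceil(utilization * num_groups) groups stay active and the rest are
    zeroed. A real implementation would skip the masked computation
    entirely to save FLOPs rather than compute-then-mask as done here.
    """
    out = x @ W + b                          # full pre-activation, for clarity
    units = out.shape[-1]
    group_size = units // num_groups
    active = int(np.ceil(utilization * num_groups))
    mask = np.zeros(units)
    mask[: active * group_size] = 1.0        # gate: keep the first `active` groups
    return out * mask
```

At `utilization=1.0` every group fires and the layer behaves like its vanilla counterpart; at lower settings the output is produced by a narrower sub-network, which is what lets a separately trained controller trade accuracy for compute at runtime.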
ISSN: 0932-8092; 1432-1769
DOI: 10.1007/s00138-022-01311-z