Gradient Deconfliction-Based Training For Multi-Exit Architectures


Bibliographic Details
Published in: 2020 IEEE International Conference on Image Processing (ICIP), pp. 1866-1870
Main Authors: Wang, Xinglu; Li, Yingming
Format: Conference Proceeding
Language: English
Published: IEEE, 01.10.2020

Summary: Multi-exit architectures, in which a sequence of intermediate classifiers is introduced at different depths of the feature layers, perform adaptive computation by early-exiting "easy" samples to speed up inference. In this paper, we propose a new gradient deconfliction-based training technique for multi-exit architectures. In particular, the conflict between the gradients back-propagated from different classifiers is removed by projecting the gradient from one classifier onto the normal plane of the gradient from the other classifier. Experiments on CIFAR-100 and ImageNet show that the gradient deconfliction-based training strategy significantly improves the performance of state-of-the-art multi-exit neural networks. Moreover, this method requires no architecture modifications and can be effectively combined with other previously proposed training techniques to further boost performance.
ISSN: 2381-8549
DOI: 10.1109/ICIP40778.2020.9190812
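The projection step described in the summary can be sketched in a few lines. This is a minimal NumPy illustration, not the authors' implementation: when two classifiers' gradients conflict (negative dot product), the component of one gradient along the other is subtracted, leaving only its component in the normal plane. The function name and signature are illustrative assumptions.

```python
import numpy as np

def deconflict(g1, g2):
    """Project g1 onto the normal plane of g2 when the two gradients conflict.

    Illustrative sketch of the projection described in the abstract:
    gradients conflict when their dot product is negative; removing the
    component of g1 along g2 eliminates the conflicting direction while
    keeping the part of g1 orthogonal to g2.
    """
    dot = np.dot(g1, g2)
    if dot < 0:  # only modify g1 when the gradients point against each other
        g1 = g1 - (dot / np.dot(g2, g2)) * g2
    return g1
```

In a multi-exit network, such a projection would be applied between the per-exit gradients of shared parameters before the optimizer step; non-conflicting gradient pairs are left unchanged.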