To Ensemble or Not Ensemble: When Does End-to-End Training Fail?

Bibliographic Details
Published in: Machine Learning and Knowledge Discovery in Databases, Vol. 12459, pp. 109-123
Main Authors: Webb, Andrew; Reynolds, Charles; Chen, Wenlin; Reeve, Henry; Iliescu, Dan; Luján, Mikel; Brown, Gavin
Format: Book Chapter
Language: English
Published: Switzerland: Springer International Publishing AG, 2021
Series: Lecture Notes in Computer Science

Summary: End-to-End training (E2E) is becoming increasingly popular for training complex Deep Network architectures. An interesting question is whether this trend will continue: are there any clear failure cases for E2E training? We study this question in depth, for the specific case of E2E training an ensemble of networks. Our strategy is to blend the gradient smoothly between two extremes: from independent training of the networks, up to full E2E training. We find clear failure cases, where overparameterized models cannot be trained E2E. A surprising result is that the optimum can sometimes lie in between the two, being neither an ensemble nor an E2E system. The work also uncovers links to Dropout, and raises questions around the nature of ensemble diversity and multi-branch networks.
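The summary describes blending smoothly between independent training of ensemble members and full E2E training of the combined system. One common way such a blend can be realized (a minimal sketch, not necessarily the chapter's exact formulation; the coefficient name `lam` is an assumption) is a convex combination of the two loss extremes: the average of each member's own loss, and the loss of the averaged ensemble prediction.

```python
import numpy as np

def blended_loss(preds, y, lam):
    """Convex blend between the two training extremes for a regression ensemble.

    preds: array of shape (M, N) -- predictions of M ensemble members on N points
    y:     array of shape (N,)   -- regression targets
    lam:   blend coefficient in [0, 1]; 0 = independent training, 1 = full E2E
           (hypothetical parameterization, for illustration only)
    """
    # Independent extreme: average each member's own squared error
    indep = np.mean((preds - y) ** 2)
    # E2E extreme: squared error of the averaged (ensemble) prediction
    e2e = np.mean((preds.mean(axis=0) - y) ** 2)
    return (1.0 - lam) * indep + lam * e2e

preds = np.array([[1.0, 2.0], [3.0, 4.0]])
y = np.array([2.0, 2.0])
print(blended_loss(preds, y, 0.0))  # 1.5 -- independent-training loss
print(blended_loss(preds, y, 1.0))  # 0.5 -- E2E (joint) loss
```

Sweeping `lam` between 0 and 1 traces the family of training objectives the summary alludes to; by Jensen's inequality the E2E extreme is never larger than the independent extreme for a squared-error loss.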
Bibliography: Electronic supplementary material: the online version of this chapter (https://doi.org/10.1007/978-3-030-67664-3_7) contains supplementary material, which is available to authorized users.
ISBN: 3030676633; 9783030676636
ISSN: 0302-9743; 1611-3349
DOI: 10.1007/978-3-030-67664-3_7