Step-size Adaptation Using Exponentiated Gradient Updates
Optimizers like Adam and AdaGrad have been very successful in training large-scale neural networks. Yet, the performance of these methods is heavily dependent on a carefully tuned learning rate schedule. We show that in many large-scale applications, augmenting a given optimizer with an adaptive tuning of the step-size greatly improves the performance.
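The idea named in the title can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a single global step-size scale updated multiplicatively (an exponentiated-gradient-style update) from the alignment of consecutive gradients, and the function name `eg_step_size_update` and the hyper-parameter `beta` are illustrative choices.

```python
import numpy as np

def eg_step_size_update(scale, grad, prev_grad, beta=0.01, eps=1e-12):
    """Multiplicatively adjust a global step-size scale using the
    alignment (cosine similarity) of two consecutive gradients.

    Exponentiated-gradient style: scale *= exp(beta * alignment), so
    aligned gradients grow the step size, opposing gradients shrink it,
    and the scale always stays positive.
    """
    alignment = float(grad @ prev_grad) / (
        np.linalg.norm(grad) * np.linalg.norm(prev_grad) + eps
    )
    return scale * np.exp(beta * alignment)

# Toy usage: plain SGD on a quadratic, with the global scale adapted
# at every step from the previous gradient.
w = np.array([5.0, -3.0])
scale, base_lr = 1.0, 0.1
prev_grad = None
for _ in range(100):
    grad = 2.0 * w                      # gradient of ||w||^2
    if prev_grad is not None:
        scale = eg_step_size_update(scale, grad, prev_grad)
    w -= base_lr * scale * grad
    prev_grad = grad
print(w)  # approaches the minimizer at the origin
```

A multiplicative update is a natural fit here because the step size must remain positive and may need to traverse several orders of magnitude, which an additive correction does slowly at best.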
Main Authors | Ehsan Amid, Rohan Anil, Christopher Fifty, Manfred K. Warmuth
---|---
Format | Journal Article
Language | English
Published | 31.01.2022