Step-size Adaptation Using Exponentiated Gradient Updates

Optimizers like Adam and AdaGrad have been very successful in training large-scale neural networks. Yet, the performance of these methods is heavily dependent on a carefully tuned learning rate schedule. We show that in many large-scale applications, augmenting a given optimizer with an adaptive tuning of the step-size greatly improves the performance.
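
To make the idea concrete, below is a minimal sketch (in plain NumPy) of how a multiplicative, exponentiated-gradient update can adapt a global step-size scale wrapped around ordinary SGD: the scale grows when consecutive gradients align and shrinks when they oppose each other. The function name sgd_with_eg_stepsize, the hyperparameters, and the gradient-alignment heuristic are illustrative assumptions, not the paper's exact algorithm.

import numpy as np

def sgd_with_eg_stepsize(grad_fn, w, base_lr=0.01, meta_lr=0.01, steps=1000):
    # Global multiplicative step-size scale; the exponentiated-gradient update
    # keeps it positive without any explicit projection.
    s = 1.0
    prev_g = np.zeros_like(w)  # gradient from the previous iteration
    for _ in range(steps):
        g = grad_fn(w)
        # Alignment of consecutive gradients acts as a (negated) hypergradient of
        # the loss with respect to the step-size: agreement suggests the previous
        # step was too small, disagreement suggests it overshot.
        align = float(np.dot(g, prev_g))
        norm = float(np.linalg.norm(g) * np.linalg.norm(prev_g)) + 1e-12
        # Multiplicative update of the scale, bounded per step by exp(+/- meta_lr).
        s *= np.exp(meta_lr * align / norm)
        w = w - base_lr * s * g  # scaled SGD step
        prev_g = g
    return w

# Example: minimize f(w) = 0.5 * ||w||^2, whose gradient is simply w.
w_star = sgd_with_eg_stepsize(lambda w: w, np.ones(5))

Because the scale is updated multiplicatively, it remains positive by construction, which is the usual motivation for exponentiated-gradient rather than additive hypergradient updates.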

Bibliographic Details
Main Authors: Amid, Ehsan; Anil, Rohan; Fifty, Christopher; Warmuth, Manfred K.
Format: Journal Article
Language: English
Published: 31.01.2022
