Step-size Adaptation Using Exponentiated Gradient Updates
Optimizers like Adam and AdaGrad have been very successful in training large-scale neural networks. Yet, the performance of these methods is heavily dependent on a carefully tuned learning rate schedule. We show that in many large-scale applications, augmenting a given optimizer with an adaptive tuning of the step-size greatly improves the performance.
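The idea named in the title can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a single global step-size scale updated multiplicatively (an exponentiated-gradient-style update) from the alignment of consecutive gradients, and the function name `eg_step_size_update` and the hyper-parameter `beta` are illustrative choices.

```python
import numpy as np

def eg_step_size_update(scale, grad, prev_grad, beta=0.01, eps=1e-12):
    """Multiplicatively adjust a global step-size scale using the
    alignment (cosine similarity) of two consecutive gradients.

    Exponentiated-gradient style: scale *= exp(beta * alignment), so
    aligned gradients grow the step size, opposing gradients shrink it,
    and the scale always stays positive.
    """
    alignment = float(grad @ prev_grad) / (
        np.linalg.norm(grad) * np.linalg.norm(prev_grad) + eps
    )
    return scale * np.exp(beta * alignment)

# Toy usage: plain SGD on a quadratic, with the global scale adapted
# at every step from the previous gradient.
w = np.array([5.0, -3.0])
scale, base_lr = 1.0, 0.1
prev_grad = None
for _ in range(100):
    grad = 2.0 * w                      # gradient of ||w||^2
    if prev_grad is not None:
        scale = eg_step_size_update(scale, grad, prev_grad)
    w -= base_lr * scale * grad
    prev_grad = grad
print(w)  # approaches the minimizer at the origin
```

A multiplicative update is a natural fit here because the step size must remain positive and may need to traverse several orders of magnitude, which an additive correction does slowly at best.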
Main Authors | Ehsan Amid, Rohan Anil, Christopher Fifty, Manfred K. Warmuth
---|---
Format | Journal Article
Language | English
Published | 31.01.2022