Recent Advances in Stochastic Gradient Descent in Deep Learning

Bibliographic Details
Published in	Mathematics (Basel) Vol. 11; no. 3; p. 682
Main Authors	Tian, Yingjie; Zhang, Yuqi; Zhang, Haibin
Format	Journal Article
Language	English
Published	Basel MDPI AG 01.01.2023
Subjects	Algorithms; Analysis; Artificial intelligence; Audio data; Computers; Data processing; Datasets; Deep learning; Distance learning; Food science; Language; Machine learning; Methods; Natural language processing; Neural networks; Optimization algorithms; Speech; Stochastic gradient descent; Stochastic processes; China

More Information
Summary:	In the age of artificial intelligence, finding the best approach to handling huge amounts of data is a highly motivating and difficult problem. Among machine learning methods, stochastic gradient descent (SGD) is not only simple but also very effective. This study provides a detailed analysis of contemporary state-of-the-art deep learning applications, such as natural language processing (NLP), visual data processing, and voice and audio processing. Following that, this study introduces SGD and several of its variants that are already available among the PyTorch optimizers, including SGD, Adagrad, Adadelta, RMSprop, Adam, and AdamW. Finally, we propose theoretical conditions under which these methods are applicable and find that a gap remains between the theoretical conditions under which the algorithms converge and practical applications; how to bridge this gap is a question for future work.
ISSN:	2227-7390
DOI:	10.3390/math11030682
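
Illustrative note: the summary names SGD as the base method behind the optimizers it surveys. The following is a minimal sketch of plain mini-batch SGD on a toy least-squares problem, written in pure Python. It is not taken from the article; the function name `sgd_linear_fit`, the toy data, and all hyperparameter values are assumptions chosen for illustration only.

```python
import random


def sgd_linear_fit(xs, ys, lr=0.1, epochs=50, batch_size=10, seed=0):
    """Fit y ~ w*x + b by mini-batch SGD on the mean squared error.

    Hypothetical helper for illustration; hyperparameters are assumed values.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # draw a new random batch order each epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradient of the batch-averaged squared error w.r.t. w and b
            grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            # SGD step: move against the stochastic gradient
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


# Toy, noise-free data generated from y = 2x - 1 (illustrative values only)
data_rng = random.Random(1)
xs = [data_rng.uniform(-1.0, 1.0) for _ in range(200)]
ys = [2.0 * x - 1.0 for x in xs]

w, b = sgd_linear_fit(xs, ys)
print(f"w={w:.3f}, b={b:.3f}")
```

In PyTorch, the same update rule is provided by `torch.optim.SGD`; the other optimizers listed in the summary (Adagrad, Adadelta, RMSprop, Adam, AdamW) modify how the step is scaled or accumulated rather than how the stochastic gradient is sampled.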