Efficient BackProp

Bibliographic Details
Published in: Neural Networks: Tricks of the Trade, pp. 9-48
Main Authors: LeCun, Yann A.; Bottou, Léon; Orr, Genevieve B.; Müller, Klaus-Robert
Format: Book Chapter
Language: English
Published: Berlin, Heidelberg: Springer Berlin Heidelberg, 2012
Series: Lecture Notes in Computer Science

More Information
Summary: The convergence of back-propagation learning is analyzed so as to explain common phenomena observed by practitioners. Many undesirable behaviors of backprop can be avoided with tricks that are rarely exposed in serious technical publications. This paper gives some of those tricks, and offers explanations of why they work. Many authors have suggested that second-order optimization methods are advantageous for neural net training. It is shown that most "classical" second-order methods are impractical for large neural networks. A few methods are proposed that do not have these limitations.
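As a rough illustration of the kind of tricks the chapter covers, the sketch below (not the authors' reference code; the toy data, layer sizes, and learning rate are arbitrary choices) applies two of them to a small one-hidden-layer tanh network: normalizing inputs to zero mean and unit variance, and stochastic (per-example, shuffled) gradient descent rather than full-batch updates.

```python
# Minimal sketch of input normalization + stochastic gradient descent,
# assuming a toy regression problem and a one-hidden-layer tanh network.
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = sin(x0) + 0.5*x1 plus noise.
X = rng.uniform(-3, 3, size=(256, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + 0.1 * rng.standard_normal(256)

# Trick: normalize each input to zero mean and unit variance so the
# error surface is better conditioned.
X = (X - X.mean(axis=0)) / X.std(axis=0)

# One hidden layer of tanh units, linear output.
n_in, n_hid = 2, 8
W1 = rng.standard_normal((n_in, n_hid)) / np.sqrt(n_in)
b1 = np.zeros(n_hid)
W2 = rng.standard_normal(n_hid) / np.sqrt(n_hid)
b2 = 0.0
lr = 0.05  # illustrative learning rate

for epoch in range(50):
    # Trick: shuffle examples and update after each one (stochastic learning).
    for i in rng.permutation(len(X)):
        x, t = X[i], y[i]
        h = np.tanh(x @ W1 + b1)      # forward pass, hidden layer
        out = h @ W2 + b2             # linear output
        err = out - t                 # dE/dout for squared error
        # Backward pass (chain rule).
        gW2 = err * h
        gb2 = err
        gh = err * W2 * (1.0 - h**2)  # back through tanh
        gW1 = np.outer(x, gh)
        gb1 = gh
        # Plain SGD updates.
        W2 -= lr * gW2; b2 -= lr * gb2
        W1 -= lr * gW1; b1 -= lr * gb1

mse = np.mean((np.tanh(X @ W1 + b1) @ W2 + b2 - y) ** 2)
print(f"training MSE after 50 epochs: {mse:.4f}")
```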
Bibliography: Previously published in: Orr, G.B. and Müller, K.-R. (Eds.): LNCS 1524, ISBN 978-3-540-65311-0 (1998).
ISBN: 9783642352881; 364235288X
ISSN: 0302-9743; 1611-3349
DOI: 10.1007/978-3-642-35289-8_3