New results on recurrent network training: unifying the algorithms and accelerating convergence

Bibliographic Details
Published in: IEEE Transactions on Neural Networks, Vol. 11, No. 3, pp. 697–709
Main Authors: Atiya, A. F.; Parlos, A. G.
Format: Journal Article
Language: English
Published: United States: IEEE, 01.05.2000

Summary: How to efficiently train recurrent networks remains a challenging and active research topic. Most of the proposed training approaches are based on computational ways to efficiently obtain the gradient of the error function, and can generally be grouped into five major classes. In this study we present a derivation that unifies these approaches: we demonstrate that they are simply five different ways of solving a particular matrix equation. The second goal of this paper is to develop a new algorithm based on the insights gained from the novel formulation. The new algorithm, which is based on approximating the error gradient, has lower computational complexity per weight update than the competing techniques for most typical problems, and it reaches the error minimum in far fewer iterations. A desirable characteristic of recurrent-network training algorithms is the ability to update the weights in an online fashion; we have therefore also developed an online version of the proposed algorithm, based on updating the error gradient approximation in a recursive manner.
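The summary describes the new algorithm only in outline: an approximation of the error gradient, updated recursively so that the weights can be adjusted online. As a loose illustration of that general pattern only, and not of the actual algorithm of Atiya and Parlos, here is a minimal NumPy sketch; the network model, the one-step gradient truncation, the exponential running estimate, and every name in it are assumptions made for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny recurrent network: h_t = tanh(W h_{t-1} + U x_t), y_t = V h_t.
# (Architecture and all parameter names are illustrative assumptions.)
n_in, n_hid, n_out = 2, 16, 1
W = 0.1 * rng.standard_normal((n_hid, n_hid))
U = 0.1 * rng.standard_normal((n_hid, n_in))
V = 0.1 * rng.standard_normal((n_out, n_hid))

lr, decay = 0.05, 0.9
# Recursively maintained running estimates of the (approximate) gradients.
GW, GU, GV = np.zeros_like(W), np.zeros_like(U), np.zeros_like(V)

def online_step(x, d, h_prev):
    """One online step: form a cheap one-step-truncated gradient estimate,
    fold it into a recursive running average, and update the weights."""
    global W, U, V, GW, GU, GV
    h = np.tanh(W @ h_prev + U @ x)
    err = V @ h - d                            # output error

    # One-step truncation: ignore how h_prev itself depends on the weights,
    # so the gradient estimate costs O(n_hid^2) per step with no replay.
    delta = (V.T @ err) * (1.0 - h**2)         # backprop into pre-activations
    gW, gU, gV = np.outer(delta, h_prev), np.outer(delta, x), np.outer(err, h)

    # Recursive update of the gradient approximation (exponential trace), so
    # earlier history keeps influencing the estimate without re-running it.
    GW = decay * GW + (1.0 - decay) * gW
    GU = decay * GU + (1.0 - decay) * gU
    GV = decay * GV + (1.0 - decay) * gV

    W -= lr * GW
    U -= lr * GU
    V -= lr * GV
    return h, float(0.5 * err @ err)

# Toy usage: learn to echo the previous input's first component.
h, x_prev = np.zeros(n_hid), np.zeros(n_in)
for t in range(2000):
    x = rng.standard_normal(n_in)
    h, loss = online_step(x, x_prev[:1], h)
    x_prev = x
```

The sketch is meant only to show the shape of such a computation: each step touches the weights once, nothing from the past is replayed, and training quality rests entirely on how well the recursive estimate tracks the true gradient, which is the trade-off at stake in any such approximation.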
ISSN: 1045-9227
EISSN: 1941-0093
DOI: 10.1109/72.846741