Gated Orthogonal Recurrent Units: On Learning to Forget
Published in | Neural Computation, Vol. 31, No. 4, pp. 765-783 |
---|---|
Main Authors | Li Jing, Caglar Gulcehre, John Peurifoy, Yichen Shen, Max Tegmark, Marin Soljačić, Yoshua Bengio |
Format | Journal Article |
Language | English |
Published | MIT Press, One Rogers Street, Cambridge, MA 02142-1209, USA, 01.04.2019 |
Summary | We present a novel recurrent neural network (RNN)–based model that combines the remembering ability of unitary evolution RNNs with the ability of gated RNNs to effectively forget redundant or irrelevant information in its memory. We achieve this by extending restricted orthogonal evolution RNNs with a gating mechanism similar to gated recurrent unit RNNs with a reset gate and an update gate. Our model is able to outperform long short-term memory, gated recurrent units, and vanilla unitary or orthogonal RNNs on several long-term-dependency benchmark tasks. We empirically show that both orthogonal and unitary RNNs lack the ability to forget. This ability plays an important role in RNNs. We provide competitive results along with an analysis of our model on many natural sequential tasks, including question answering, speech spectrum prediction, character-level language modeling, and synthetic tasks that involve long-term dependencies such as algorithmic, denoising, and copying tasks. |
ISSN | 0899-7667 |
EISSN | 1530-888X |
DOI | 10.1162/neco_a_01174 |
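As a rough companion to the summary above, the following is a minimal NumPy sketch of a single gated orthogonal recurrent unit step: a GRU-style reset gate and update gate wrapped around an orthogonal hidden-to-hidden transition. This is an illustration under simplifying assumptions, not the authors' implementation: the paper parameterizes the restricted orthogonal transition with efficient tunable rotations and learns it end to end, whereas this sketch draws a fixed orthogonal matrix from a QR decomposition and uses tanh as the nonlinearity; all variable names (W_r, U_r, W_z, and so on) are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GORUCell:
    """Illustrative gated orthogonal recurrent unit (forward pass only).

    Hedged sketch: the orthogonal matrix U here comes from a QR
    decomposition and is fixed, standing in for the paper's trainable
    rotation-based parameterization.
    """

    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        s = 1.0 / np.sqrt(hidden_size)
        # Orthogonal recurrent transition: U @ U.T == I, so repeated
        # application preserves the norm of the hidden state (the
        # "remembering" half of the model).
        self.U, _ = np.linalg.qr(rng.normal(size=(hidden_size, hidden_size)))
        # GRU-style gate parameters: reset gate r and update gate z
        # (the "forgetting" half of the model).
        self.W_r = rng.uniform(-s, s, (hidden_size, input_size))
        self.U_r = rng.uniform(-s, s, (hidden_size, hidden_size))
        self.b_r = np.zeros(hidden_size)
        self.W_z = rng.uniform(-s, s, (hidden_size, input_size))
        self.U_z = rng.uniform(-s, s, (hidden_size, hidden_size))
        self.b_z = np.zeros(hidden_size)
        # Input projection for the candidate hidden state.
        self.W_h = rng.uniform(-s, s, (hidden_size, input_size))
        self.b_h = np.zeros(hidden_size)

    def step(self, x, h):
        r = sigmoid(self.W_r @ x + self.U_r @ h + self.b_r)  # reset gate
        z = sigmoid(self.W_z @ x + self.U_z @ h + self.b_z)  # update gate
        # Candidate state: orthogonal transition applied to the
        # reset-gated previous state, plus the projected input.
        h_tilde = np.tanh(self.U @ (r * h) + self.W_h @ x + self.b_h)
        # The update gate interpolates between the old memory and the
        # candidate, which is what lets the unit discard stale content.
        return z * h + (1.0 - z) * h_tilde

# Tiny usage example: run a 5-step random sequence through one cell.
cell = GORUCell(input_size=4, hidden_size=8)
h = np.zeros(8)
for x in np.random.default_rng(1).normal(size=(5, 4)):
    h = cell.step(x, h)
print(h.shape)  # (8,)
```

The update gate z interpolates between the previous state and the candidate, which is the forgetting mechanism the summary credits for outperforming vanilla unitary or orthogonal RNNs on long-term-dependency tasks.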