Nested LSTMs
Main Authors | |
---|---|
Format | Journal Article |
Language | English |
Published | 31.01.2018 |
Subjects | |
Summary: | Proceedings of the Ninth Asian Conference on Machine Learning, PMLR 77:530-544, 2017. We propose Nested LSTMs (NLSTM), a novel RNN architecture with multiple levels of memory. Nested LSTMs add depth to LSTMs via nesting as opposed to stacking. The value of a memory cell in an NLSTM is computed by an LSTM cell, which has its own inner memory cell. Specifically, instead of computing the value of the (outer) memory cell as $c^{outer}_t = f_t \odot c_{t-1} + i_t \odot g_t$, NLSTM memory cells use the concatenation $(f_t \odot c_{t-1}, i_t \odot g_t)$ as input to an inner LSTM (or NLSTM) memory cell, and set $c^{outer}_t = h^{inner}_t$. Nested LSTMs outperform both stacked and single-layer LSTMs with similar numbers of parameters in our experiments on various character-level language modeling tasks, and the inner memories of an NLSTM learn longer-term dependencies compared with the higher-level units of a stacked LSTM. |
DOI: | 10.48550/arxiv.1801.10308 |
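
The memory-cell update described in the summary lends itself to a short sketch. Below is a minimal NumPy illustration, assuming a nesting depth of two, a standard LSTM parameterization for both the outer gates and the inner cell, and a tanh output activation; the function names (`nlstm_step`, `lstm_step`), weight shapes, and initialization are illustrative choices, not the authors' reference implementation, and the paper considers other activation choices.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One standard LSTM step. W maps concat(x, h_prev) to the four
    stacked gate pre-activations (i, f, o, g); b is the matching bias."""
    z = W @ np.concatenate([x, h_prev]) + b
    i, f, o, g = np.split(z, 4)
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c

def nlstm_step(x, state, params):
    """One Nested LSTM step (nesting depth 2), following the abstract:
    f ⊙ c_{t-1} and i ⊙ g are handed to an inner LSTM instead of being
    summed, and c^outer_t is set to the inner cell's hidden output."""
    h_outer, c_outer, h_inner, c_inner = state
    W_out, b_out, W_in, b_in = params

    # Outer gates, computed exactly as in a standard LSTM.
    z = W_out @ np.concatenate([x, h_outer]) + b_out
    i, f, o, g = np.split(z, 4)
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)

    # Instead of c_outer = f * c_prev + i * g, feed the concatenation
    # (f ⊙ c_{t-1}, i ⊙ g) to the inner LSTM as its input.
    x_inner = np.concatenate([f * c_outer, i * g])
    h_inner, c_inner = lstm_step(x_inner, h_inner, c_inner, W_in, b_in)

    # The outer memory cell is the inner cell's hidden state.
    c_outer_new = h_inner
    h_outer_new = o * np.tanh(c_outer_new)
    return h_outer_new, (h_outer_new, c_outer_new, h_inner, c_inner)

# Tiny usage sketch (sizes and random initialization are illustrative only).
D, H = 8, 16
rng = np.random.default_rng(0)
W_out = 0.1 * rng.standard_normal((4 * H, D + H)); b_out = np.zeros(4 * H)
W_in  = 0.1 * rng.standard_normal((4 * H, 3 * H));  b_in = np.zeros(4 * H)
state = (np.zeros(H), np.zeros(H), np.zeros(H), np.zeros(H))
for x in rng.standard_normal((5, D)):   # a short input sequence
    y, state = nlstm_step(x, state, (W_out, b_out, W_in, b_in))
```

Note the recursion implied by the abstract: the inner cell here is a plain LSTM, but it could itself be another NLSTM cell, giving deeper nesting.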