Framewise phoneme classification with bidirectional LSTM networks

Bibliographic Details
Published in	Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005, Vol. 4, pp. 2047-2052
Main Authors	Graves, A., Schmidhuber, J.
Format	Conference Proceeding
Language	English
Published	IEEE 2005
Subjects	Acoustic measurements; Data analysis; Electronic mail; Error correction; Hidden Markov models; Memory architecture; Neural networks; Recurrent neural networks; Speech recognition
Summary:	In this paper, we apply bidirectional training to a long short term memory (LSTM) network for the first time. We also present a modified, full gradient version of the LSTM learning algorithm. We discuss the significance of framewise phoneme classification to continuous speech recognition, and the validity of using bidirectional networks for online causal tasks. On the TIMIT speech database, we measure the framewise phoneme classification scores of bidirectional and unidirectional variants of both LSTM and conventional recurrent neural networks (RNNs). We find that bidirectional LSTM outperforms both RNNs and unidirectional LSTM.
ISBN:	0780390482 9780780390485
ISSN:	2161-4393
DOI:	10.1109/IJCNN.2005.1556215
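
Illustration:	The summary describes bidirectional recurrent networks that label every frame of an utterance. The sketch below is a rough illustration of that idea only, not the authors' implementation. It assumes a plain tanh recurrence in place of LSTM cells, random untrained weights, and toy feature vectors instead of TIMIT data; every function and parameter name is hypothetical. It shows one pass over the frames in each direction, the two hidden states joined per frame, and a framewise accuracy score.

```python
import math
import random


def run_direction(frames, w_in, w_rec, reverse=False):
    """Run a simple tanh recurrence (a stand-in for LSTM) in one direction.

    Returns one hidden state per frame, indexed in the original frame order.
    """
    hidden_size = len(w_rec)
    state = [0.0] * hidden_size
    n = len(frames)
    order = range(n - 1, -1, -1) if reverse else range(n)
    states = [None] * n
    for t in order:
        new_state = []
        for j in range(hidden_size):
            total = sum(w * x for w, x in zip(w_in[j], frames[t]))
            total += sum(w * s for w, s in zip(w_rec[j], state))
            new_state.append(math.tanh(total))
        state = new_state
        states[t] = state
    return states


def softmax(scores):
    """Convert raw scores to probabilities that sum to one."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_frames(frames, params):
    """Return one class distribution per frame from both directions."""
    fwd = run_direction(frames, params["w_in_f"], params["w_rec_f"])
    bwd = run_direction(frames, params["w_in_b"], params["w_rec_b"], reverse=True)
    outputs = []
    for f, b in zip(fwd, bwd):
        features = f + b  # concatenate past context and future context
        scores = [sum(w * v for w, v in zip(row, features))
                  for row in params["w_out"]]
        outputs.append(softmax(scores))
    return outputs


def framewise_accuracy(outputs, labels):
    """Fraction of frames whose most probable class equals the label."""
    correct = sum(1 for probs, label in zip(outputs, labels)
                  if probs.index(max(probs)) == label)
    return correct / len(labels)


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def make_params(n_in, n_hidden, n_classes, seed=0):
    """Random, untrained weights (assumption: for illustration only)."""
    rng = random.Random(seed)
    return {
        "w_in_f": random_matrix(n_hidden, n_in, rng),
        "w_rec_f": random_matrix(n_hidden, n_hidden, rng),
        "w_in_b": random_matrix(n_hidden, n_in, rng),
        "w_rec_b": random_matrix(n_hidden, n_hidden, rng),
        "w_out": random_matrix(n_classes, 2 * n_hidden, rng),
    }


if __name__ == "__main__":
    params = make_params(n_in=3, n_hidden=4, n_classes=5)
    toy_frames = [[0.1, 0.2, 0.3], [0.0, -0.4, 0.5], [0.9, 0.1, -0.2]]
    toy_labels = [0, 3, 1]  # made-up labels, one per frame
    predictions = classify_frames(toy_frames, params)
    print(framewise_accuracy(predictions, toy_labels))
```

Because the backward pass starts at the last frame, the output for the first frame depends on the whole utterance. The network therefore needs the complete input sequence, or at least a window of future frames, before it can emit labels, which is why the summary raises the question of online causal tasks.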