Advances in all-neural speech recognition

Bibliographic Details
Published in	2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4805-4809
Main Authors	Zweig, Geoffrey; Yu, Chengzhu; Droppo, Jasha; Stolcke, Andreas
Format	Conference Proceeding
Language	English
Published	IEEE 01.03.2017
Subjects	Acoustics; CTC; Decoding; end-to-end training; Hidden Markov models; Neural networks; recurrent neural network; Speech recognition; Training
Summary:	This paper advances the design of CTC-based all-neural (or end-to-end) speech recognizers. We propose a novel symbol inventory, and a novel iterated-CTC method in which a second system is used to transform a noisy initial output into a cleaner version. We present a number of stabilization and initialization methods we have found useful in training these networks. We evaluate our system on the commonly used NIST 2000 conversational telephony test set, and significantly exceed the previously published performance of similar systems, both with and without the use of an external language model and decoding technology.
ISSN:	2379-190X
DOI:	10.1109/ICASSP.2017.7953069