N-gram Language Model Based on Multi-Word Expressions in Web Documents for Speech Recognition and Closed-Captioning

Bibliographic Details
Published in	2012 International Conference on Asian Language Processing (IALP) pp. 225 - 228
Main Authors	Takahashi, S., Morimoto, T.
Format	Conference Proceeding
Language	English
Published	IEEE 01.11.2012
Subjects	Adaptation models; closed-captioning; Computational modeling; language model; N-gram; Probability; Speech; Speech recognition; Training; Vocabulary; web documents
ISBN	9781467361132 1467361135
DOI	10.1109/IALP.2012.55

Summary:	Automatic speech recognition is generally used to align closed-caption text with video data, so improving recognition accuracy is important for accurate closed-captioning. This paper proposes a method for constructing an N-gram language model based on multi-word expressions (MWEs) obtained from web retrieval results, with the aim of improving speech recognition performance. Two experiments are conducted: a web retrieval experiment that examines the distribution of web counts for MWEs, and a speech recognition experiment that investigates the effectiveness of MWEs. The experimental results show that the proposed method can improve both recognition performance and closed-captioning accuracy.