N-gram Language Model Based on Multi-Word Expressions in Web Documents for Speech Recognition and Closed-Captioning

Bibliographic Details
Published in: 2012 International Conference on Asian Language Processing (IALP), pp. 225-228
Main Authors: Takahashi, S.; Morimoto, T.
Format: Conference Proceeding
Language: English
Published: IEEE, 01.11.2012
ISBN: 9781467361132, 1467361135
DOI: 10.1109/IALP.2012.55

Summary: Automatic speech recognition is generally used to align closed-caption text with video data, so higher recognition accuracy is important for accurate closed-captioning. This paper proposes a method for constructing an N-gram language model based on multi-word expressions (MWEs) obtained from web retrieval results, with the aim of improving speech recognition performance. Two experiments are conducted: a web retrieval experiment that examines the distribution of web counts for MWEs, and a speech recognition experiment that investigates the effectiveness of MWEs. The experimental results show that the proposed method can improve both recognition performance and closed-captioning accuracy.
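
The summary does not describe how MWEs are selected or how the model is estimated, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes that candidate MWEs are kept when a web hit count reaches a threshold, that each kept MWE is joined into a single token, and that bigram probabilities are plain maximum-likelihood estimates. The function names, the threshold, the toy counts, and the toy corpus are all hypothetical.

```python
from collections import Counter

def select_mwes(web_counts, min_count):
    """Keep candidate MWEs whose (hypothetical) web hit count reaches min_count."""
    return {tuple(mwe.split()) for mwe, c in web_counts.items() if c >= min_count}

def merge_mwes(tokens, mwes):
    """Greedily join the longest matching MWE at each position into one token."""
    max_len = max((len(m) for m in mwes), default=1)
    out, i = [], 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            if tuple(tokens[i:i + n]) in mwes:
                out.append("_".join(tokens[i:i + n]))
                i += n
                break
        else:
            # No MWE starts here: keep the single word.
            out.append(tokens[i])
            i += 1
    return out

def bigram_probs(sentences):
    """Maximum-likelihood bigram estimates P(w2 | w1) with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        padded = ["<s>"] + sent + ["</s>"]
        unigrams.update(padded[:-1])          # history counts
        bigrams.update(zip(padded, padded[1:]))
    return {(w1, w2): c / unigrams[w1] for (w1, w2), c in bigrams.items()}

# Toy data (invented for illustration only).
web_counts = {"closed caption": 1200, "speech recognition": 5000, "caption text": 3}
mwes = select_mwes(web_counts, min_count=100)
corpus = [
    "speech recognition aligns closed caption text".split(),
    "speech recognition is used".split(),
]
merged = [merge_mwes(sentence, mwes) for sentence in corpus]
probs = bigram_probs(merged)
```

Under these assumptions, "speech recognition" and "closed caption" become single vocabulary entries, so the bigram model can condition on the whole expression instead of on its last word. A practical system would add smoothing and a much larger corpus.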