N-gram Language Model Based on Multi-Word Expressions in Web Documents for Speech Recognition and Closed-Captioning
Published in | 2012 International Conference on Asian Language Processing (IALP) pp. 225 - 228 |
---|---|
Main Authors | , |
Format | Conference Proceeding |
Language | English |
Published | IEEE 01.11.2012 |
ISBN | 9781467361132 1467361135 |
DOI | 10.1109/IALP.2012.55 |
Summary: | Automatic speech recognition is generally used to align closed-caption text with video data, so accurate closed-captioning depends on high recognition accuracy. This paper proposes a method for constructing an N-gram language model based on multi-word expressions (MWEs) drawn from web retrieval results in order to improve speech recognition performance. Two experiments are conducted: a web retrieval experiment that examines the distribution of web counts for MWEs, and a speech recognition experiment that investigates the effectiveness of MWEs. The experimental results show that the proposed method can improve both recognition performance and closed-captioning accuracy. |
---|---|