Recurrent out-of-vocabulary word detection based on distribution of features
Published in | Computer Speech & Language Vol. 58; pp. 247 - 259 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English, Japanese |
Published | Elsevier Ltd, 01.11.2019; Elsevier BV |
ISSN | 0885-2308, 1095-8363 |
DOI | 10.1016/j.csl.2019.04.007 |
Summary: | • A novel method for robustly detecting out-of-vocabulary (OOV) words is proposed. • The method focuses on the consistency of recurrent OOV words. • The degree of consistency is measured by distribution of features. • The proposed method achieves over 60% relative reduction in false alarms.
The repeated use of out-of-vocabulary (OOV) words in a spoken document seriously degrades a speech recognizer's performance. Although such recurrent OOV words are often important keywords in the document, they are never correctly recognized. We propose a novel method for robustly detecting recurrent OOV words that focuses on the degree of consistency among them. The method first detects recurrent segments, that is, recurrent phoneme subsequences in the output of a phoneme sequence decoder. It then measures the degree of consistency using the mean and variance (the distribution) of features derived from the recurrent segments, referred to as the distribution of features (DOF), and uses the DOF for in-vocabulary (IV)/OOV classification. Experiments on academic lectures show that the proposed DOF-based method can robustly detect recurrent OOV words in spontaneous speech and achieves over 60% relative reduction in false alarms. Detection performance is also confirmed to improve as OOV words are repeated more often. |
---|---|
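
The two steps named in the summary, finding recurrent phoneme subsequences and summarizing a feature by its mean and variance, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The summary does not say how segments are matched, which features are used, or which classifier follows, so the exact n-gram matching, the per-phoneme `scores` input, and all function names below are assumptions made only for illustration.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def find_recurrent_segments(phonemes, length=3, min_count=2):
    """Map each phoneme n-gram of the given length to its start positions.

    Assumption: segments recur as exact matches; only n-grams occurring
    at least min_count times are kept (overlapping occurrences count).
    """
    positions = defaultdict(list)
    for start in range(len(phonemes) - length + 1):
        positions[tuple(phonemes[start:start + length])].append(start)
    return {seg: starts for seg, starts in positions.items()
            if len(starts) >= min_count}


def occurrence_features(scores, starts, length):
    """One feature value per occurrence: the mean per-phoneme score.

    `scores` is a hypothetical per-phoneme feature (for example a
    decoder confidence); the summary does not name the real features.
    """
    return [fmean(scores[s:s + length]) for s in starts]


def distribution_of_features(feature_values):
    """Return (mean, population variance) of a feature over occurrences."""
    return fmean(feature_values), pvariance(feature_values)
```

For the phoneme string `k a t o m a k a t o s a k a t o` with `length=4` and `min_count=3`, only the segment `k a t o` is kept, at positions 0, 6, and 12. The resulting (mean, variance) pair for each segment would then be passed to whatever IV/OOV classifier is used, which this sketch leaves out.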