LSTM Recurrent Neural Network Speech Recognition System Based on I-vector Features under Low-Resource Conditions (低资源条件下基于I-vector特征的LSTM递归神经网络语音识别系统)
Under low-resource conditions, where little labeled training data is available, speech recognition systems often perform poorly. To address this problem, this paper first investigates the long short-term memory (LSTM) recurrent neural network for acoustic modeling, which models long sequences to fully exploit context information, and introduces a linear projection layer to reduce the number of model parameters. It then studies speaker modeling in the feature space, extracting an identity vector (i-vector) that effectively captures speaker and channel information. Finally, it combines the two to build an LSTM recurrent neural network system based on i-vector features. Experiments on the standard Open KWS 2013 data set show that, compared with a deep neural network baseline system, the proposed approach reduces the token error rate (TER) by about 10% relative.
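The abstract describes two ingredients: an LSTM acoustic model with a linear projection layer (the LSTMP variant) to cut the parameter count, and a per-utterance i-vector appended to the frame-level features so the network sees speaker and channel information. The sketch below is not the authors' implementation; it only illustrates that combination in PyTorch, whose `nn.LSTM(proj_size=...)` option implements the projected LSTM. All dimensions (40-dim features, 100-dim i-vectors, 1024 cells projected to 256, 3000 output states) are illustrative assumptions.

```python
# Minimal sketch (assumed setup, not the paper's actual toolkit) of an LSTMP
# acoustic model fed with frame features concatenated with an utterance-level
# i-vector.
import torch
import torch.nn as nn

class LstmpAcousticModel(nn.Module):
    def __init__(self, feat_dim=40, ivector_dim=100, hidden=1024,
                 proj=256, num_layers=2, num_states=3000):
        super().__init__()
        # proj_size < hidden_size gives the projected ("LSTMP") variant:
        # the recurrent output h_t is projected down to `proj` dimensions,
        # which shrinks the recurrent weight matrices.
        self.lstm = nn.LSTM(input_size=feat_dim + ivector_dim,
                            hidden_size=hidden,
                            proj_size=proj,
                            num_layers=num_layers,
                            batch_first=True)
        self.output = nn.Linear(proj, num_states)  # state (senone) posteriors

    def forward(self, frames, ivector):
        # frames:  (batch, T, feat_dim)   frame-level acoustic features
        # ivector: (batch, ivector_dim)   one i-vector per utterance
        # Tile the i-vector across time and append it to every frame.
        ivec = ivector.unsqueeze(1).expand(-1, frames.size(1), -1)
        x = torch.cat([frames, ivec], dim=-1)
        h, _ = self.lstm(x)
        return self.output(h)

# Toy usage: a batch of 4 utterances, 200 frames each.
model = LstmpAcousticModel()
logits = model(torch.randn(4, 200, 40), torch.randn(4, 100))
print(logits.shape)  # torch.Size([4, 200, 3000])
```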
| Published in | 计算机应用研究 (Application Research of Computers), Vol. 34, No. 2, pp. 392–396 |
|---|---|
| Main Author | Huang Guangxu |
| Format | Journal Article |
| Language | Chinese |
| Published | State Key Laboratory of Transducer Technology, Institute of Electronics, Chinese Academy of Sciences, Beijing 100190; Tsinghua National Laboratory for Information Science and Technology, Department of Electronic Engineering, Tsinghua University, Beijing 100084; University of Chinese Academy of Sciences, Beijing 100190; 2017 |
| Subjects | speech recognition; long short-term memory (LSTM); i-vector |
| ISSN | 1001-3695 |
| DOI | 10.3969/j.issn.1001-3695.2017.02.016 |
| Bibliography | 51-1196/TP. English abstract: Under low-resource conditions, little labeled training data is available and speech recognition system performance is not ideal. To solve this problem, this paper first investigated the long short-term memory recurrent neural network (LSTM RNN) for acoustic modeling: it is a powerful tool for modeling long time series and can make full use of context information, and a linear projection layer reduces the number of model parameters. It then explored speaker modeling methods in the feature space and extracted the identity vector (i-vector), which contains speaker and channel information simultaneously. Finally, it presented a novel system combining the LSTM RNN model and the i-vector feature. Results on the standard Open KWS 2013 data set show that this technology produces a relative improvement of about 10% in TER over the DNN baseline system. Authors: Huang Guangxu, Tian Yao, Kang Jian, Liu Jia, Xia Shanhong (1. University of Chinese Academy of Sciences, …) |