Research on Isolated-Word Speech Recognition Based on Deep Learning Neural Networks
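To improve the performance of speech recognition systems, this study applies an autoencoder-based deep learning neural network to speech recognition. The network uses a greedy layer-wise pretraining algorithm and extracts the essential features of the speech signal to be recognized in two steps, pretraining and fine-tuning. This overcomes two problems of conventional multilayer artificial neural networks: they easily become trapped in local minima during training, and they require large amounts of labeled data. An alignment network then maps speech feature parameters spanning an arbitrary number of frames to a fixed number of frames, which are fed to a classifier for recognition. Simulation experiments were run on both a backpropagation neural network and the autoencoder neural network; the results show that the deep learning network improves recognition accuracy by 20.0% over the conventional neural network, making it a good model for speech recognition.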
Published in | 计算机应用研究 (Application Research of Computers) Vol. 32; no. 8; pp. 2289-2291 |
---|---|
Format | Journal Article |
Language | Chinese |
Published | School of Information and Communication, Guilin University of Electronic Technology, Guilin, Guangxi 541004, 2015 |
ISSN | 1001-3695 |
DOI | 10.3969/j.issn.1001-3695.2015.08.011 |
Bibliography | 51-1196/TP |
Authors | Wang Shanhai, Jing Xinxing, Yang Haiyan (School of Information & Communication, Guilin University of Electronic Technology, Guilin, Guangxi 541004, China) |
Keywords | speech recognition; artificial neural networks; deep learning; autoencoder; alignment networks |
Abstract | To improve the performance of conventional speech recognition systems, this paper applies autoencoder deep learning neural networks to speech recognition. The deep learning networks use a greedy layer-wise learning algorithm with pretraining and fine-tuning, which extracts the essential features of the speech signal needed for recognition. This overcomes the shortcomings of conventional multilayer artificial neural networks, which are easily trapped in local optima during training and need a large amount of labeled data. The structured alignment networks then align an arbitrary number of feature frames to a fixed number of frames and input these features to a classifier for speech recognition. |
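The abstract describes aligning speech features of arbitrary frame count to a fixed frame count before classification. This record gives no details of the paper's alignment network, so the sketch below is only an illustration of the idea and swaps in plain linear interpolation along the time axis. The function name `align_frames` and its parameters are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def align_frames(frames: Sequence[Sequence[float]], target_len: int) -> List[List[float]]:
    """Resample a variable-length list of feature frames to target_len frames.

    Assumption: linear interpolation over time stands in for the paper's
    alignment network, whose actual design is not given in this record.
    """
    if not frames:
        raise ValueError("frames must contain at least one frame")
    if target_len < 1:
        raise ValueError("target_len must be positive")

    n = len(frames)
    if n == 1 or target_len == 1:
        # Degenerate cases: nothing to interpolate, so copy the first frame.
        return [list(frames[0]) for _ in range(target_len)]

    aligned = []
    for t in range(target_len):
        # Fractional position of output frame t on the input time axis.
        pos = t * (n - 1) / (target_len - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        w = pos - lo
        # Blend the two neighbouring input frames coefficient by coefficient.
        aligned.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return aligned


# Usage: an utterance with 5 two-dimensional frames mapped to 3 frames.
utterance = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0], [3.0, 4.0], [4.0, 5.0]]
fixed = align_frames(utterance, 3)
print(len(fixed), "frames of dimension", len(fixed[0]))
```

With a fixed output length, every utterance yields a feature matrix of the same shape, which is what lets a classifier with a fixed-size input layer consume words of different durations.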