Sparse coding for speech recognition

Bibliographic Details
Published in	2010 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 4346-4349
Main Authors	Sivaram, G S V S; Nemala, S K; Elhilali, M; Tran, T D; Hermansky, H
Format	Conference Proceeding
Language	English
Published	IEEE 01.03.2010
Subjects	Automatic speech recognition; compressive sensing; Dictionaries; Feature extraction; Gabor filters; Image reconstruction; Iterative algorithms; Neurons; sparse coding; Speech recognition; Training data; Vectors

Summary:	This paper proposes a novel feature extraction technique for speech recognition based on the principles of sparse coding. The idea is to express a spectro-temporal pattern of speech as a linear combination of an overcomplete set of basis functions such that the weights of the linear combination are sparse. These weights (features) are subsequently used for acoustic modeling. We learn a set of overcomplete basis functions (a dictionary) from the training set by adopting a previously proposed algorithm that iteratively minimizes the reconstruction error and maximizes the sparsity of the weights. Features are then derived using the learned basis functions by applying the well-established principles of compressive sensing. Phoneme recognition experiments show that the proposed features outperform the conventional features in both clean and noisy conditions.
ISBN:	9781424442959, 1424442958
ISSN:	1520-6149, 2379-190X
DOI:	10.1109/ICASSP.2010.5495649
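
Illustrative sketch (not taken from the paper): the summary describes representing a speech pattern x as a sparse linear combination of overcomplete basis functions. One common way to pose this is to minimize 0.5*||x - D w||^2 + lam*||w||_1 over the weights w. This record does not say which dictionary learning algorithm or which sparse solver the authors used, so the code below swaps in iterative soft thresholding (ISTA) as a stand-in, and uses a random dictionary rather than a learned one. The names ista, objective, soft_threshold, D, x, w, and lam are all hypothetical and chosen only for this example.

```python
import random

def soft_threshold(v, t):
    """Shrink v toward zero by t (proximal operator of the L1 norm)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def reconstruct(D, w):
    """Return D @ w, where D is a list of rows (n_dims x n_atoms)."""
    return [sum(d * wj for d, wj in zip(row, w)) for row in D]

def objective(D, x, w, lam):
    """0.5 * squared reconstruction error plus lam times the L1 norm of w."""
    r = [xi - yi for xi, yi in zip(x, reconstruct(D, w))]
    return 0.5 * sum(v * v for v in r) + lam * sum(abs(v) for v in w)

def ista(D, x, lam=0.05, n_iter=500):
    """Approximately minimize objective(D, x, w, lam) over w with ISTA."""
    n_dims, n_atoms = len(D), len(D[0])
    # Conservative step size: the squared Frobenius norm of D upper-bounds
    # the Lipschitz constant of the gradient of the quadratic term.
    step = 1.0 / sum(v * v for row in D for v in row)
    w = [0.0] * n_atoms
    for _ in range(n_iter):
        r = [xi - yi for xi, yi in zip(x, reconstruct(D, w))]
        # Gradient of 0.5*||x - D w||^2 with respect to w is -D^T r.
        grad = [-sum(D[i][j] * r[i] for i in range(n_dims))
                for j in range(n_atoms)]
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad)]
    return w

# Assumption: a random unit-norm dictionary stands in for a learned one.
random.seed(0)
n_dims, n_atoms = 4, 8  # more atoms than dimensions, i.e. overcomplete
atoms = []
for _ in range(n_atoms):
    a = [random.gauss(0.0, 1.0) for _ in range(n_dims)]
    norm = sum(v * v for v in a) ** 0.5
    atoms.append([v / norm for v in a])
D = [[atoms[j][i] for j in range(n_atoms)] for i in range(n_dims)]

# Synthetic "pattern" built from two atoms, then coded against D.
x = [1.5 * atoms[2][i] - 0.8 * atoms[5][i] for i in range(n_dims)]
w = ista(D, x, lam=0.05)
print("nonzero weights:", sum(1 for v in w if v != 0.0), "of", n_atoms)
print("objective:", objective(D, x, w, 0.05))
```

In a recognizer along the lines the summary describes, the weight vector w would play the role of the feature vector passed to the acoustic model. Raising lam trades reconstruction accuracy for sparser weights.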