Continuous density Hidden Markov Model for context dependent Hindi speech recognition

Bibliographic Details
Published in	2013 International Conference on Advances in Computing, Communications and Informatics (ICACCI), pp. 1953-1958
Main Authors	Sinha, Shweta; Agrawal, S. S.; Jain, Aruna
Format	Conference Proceeding
Language	English
Published	IEEE 01.08.2013
Subjects	Feature extraction; GMM; Hidden Markov models; Hindi speech Recognition; HLDA; Mel frequency cepstral coefficient; MFCC; PLP; Speech; Speech recognition; Vectors
Summary:	Advances in technology, together with the inherent advantages of voice-based communication (its variability, speed and convenience), have drawn attention towards machine recognition of speech. A survey of the literature shows that almost every system uses a Gaussian mixture hidden Markov model for optimal speech recognition performance. Evaluating Gaussian likelihoods dominates the total computational load of this statistical approach, so the appropriate selection of Gaussian mixtures is very important. The current choice of mixture components is arbitrary, with little justification. Moreover, the standard settings used for European languages cannot be applied to Hindi speech recognition because of the mismatch in the database sizes of the languages. Parameter estimation with too many or too few components may estimate the mixture model inappropriately; the number of mixtures is therefore important for the expectation-maximization process. In this research work, the authors estimate the number of Gaussian mixture components for a Hindi database based on the size of the vocabulary. MFCC and PLP features, along with their extended versions, have been used as speech features. HLDA is applied for feature reduction when the extended features are used.
ISBN:	9781479924325 1479924326
DOI:	10.1109/ICACCI.2013.6637481
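
Note:	The summary states that Gaussian likelihood evaluation dominates the computational load and that the number of mixture components affects parameter estimation. The following is a minimal sketch of why both scale with the component count. It is not the authors' implementation: it assumes diagonal covariances, and the function names and the 39-dimensional feature size are hypothetical choices made for illustration only.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of x under a Gaussian with diagonal covariance."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM.

    Every component is evaluated once per frame, so the cost grows
    linearly with the number of mixture components.
    """
    terms = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(terms)  # log-sum-exp trick for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


def num_free_parameters(n_components, dim):
    """Free parameters of one GMM: means, variances, and weights (which sum to 1)."""
    return n_components * 2 * dim + (n_components - 1)


if __name__ == "__main__":
    # Assumed toy setup: a 2-dimensional frame and a 2-component mixture.
    frame = [0.0, 0.0]
    weights = [0.5, 0.5]
    means = [[0.0, 0.0], [1.0, 1.0]]
    variances = [[1.0, 1.0], [1.0, 1.0]]
    print(gmm_log_likelihood(frame, weights, means, variances))
    # Parameter count for an assumed 39-dimensional feature vector.
    for k in (1, 2, 4, 8, 16):
        print(k, num_free_parameters(k, 39))
```

The parameter count grows linearly with the number of components, which is one way to see why a small training database may not support as many components per state as a large one.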