A simple statistical speech recognition of mandarin monosyllables

Bibliographic Details
Published in Applied mathematics and computation Vol. 177; no. 2; pp. 644-651
Main Authors Li, Tze Fen, Chang, Shui-Ching, Lee, Chung-Bow
Format Journal Article
Language English
Published New York, NY Elsevier Inc 15.06.2006
Elsevier
Summary: Each Mandarin syllable is represented by a sequence of linear predictive coding cepstra (LPCC) vectors. Since all syllables have a simple phonetic structure, we partition the sequence of LPCC vectors of each syllable into equal segments and average the LPCC vectors in each segment. The mean LPCC vectors are used as the feature of a syllable. This simple feature does not need the time-consuming and complicated nonlinear contraction and expansion used in dynamic time warping. We propose several probability distributions for the feature values. A simplified Bayes decision rule is used for classification of Mandarin syllables. For speaker-independent Mandarin digits, the recognition rate is 98.6% if a normal distribution is used for the feature values and 98.1% if an exponential distribution is used for the absolute values of the features. The feature proposed in this paper to represent a syllable is the simplest one, and is much easier to extract than any other known feature. The computation for feature extraction and classification is much faster and more accurate than the HMM method or any other known technique.
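The two steps described in the summary (segment-averaged features, then a Bayes-style decision under a normal model) can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not specify the number of segments, the LPCC order, or the exact form of the "simplified" Bayes rule. The sketch assumes near-equal contiguous segments, independent normally distributed feature values, and equal class priors; the function names (segment_mean_feature, fit_normal, classify) are hypothetical.

```python
import math


def segment_mean_feature(frames, n_segments):
    """Split a sequence of LPCC vectors into n_segments contiguous,
    near-equal segments, average the vectors in each segment, and
    concatenate the segment means into one fixed-length feature."""
    n = len(frames)
    if n < n_segments:
        raise ValueError("need at least one frame per segment")
    dim = len(frames[0])
    feature = []
    for s in range(n_segments):
        start = s * n // n_segments
        end = (s + 1) * n // n_segments
        segment = frames[start:end]
        for d in range(dim):
            feature.append(sum(f[d] for f in segment) / len(segment))
    return feature


def fit_normal(features, var_floor=1e-6):
    """Estimate a per-dimension mean and variance from the training
    features of one syllable class (assumption: independent dimensions)."""
    n = len(features)
    dim = len(features[0])
    means = [sum(f[d] for f in features) / n for d in range(dim)]
    variances = [
        max(sum((f[d] - means[d]) ** 2 for f in features) / n, var_floor)
        for d in range(dim)
    ]
    return means, variances


def log_likelihood(x, means, variances):
    """Log density of x under independent normal distributions."""
    return sum(
        -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
        for xi, m, v in zip(x, means, variances)
    )


def classify(x, models):
    """With equal priors (an assumption), the Bayes decision reduces to
    choosing the class with the largest log-likelihood."""
    return max(models, key=lambda label: log_likelihood(x, *models[label]))
```

Because every syllable maps to a feature of the same length regardless of its duration, no time alignment between utterances is required, which is the contrast with dynamic time warping that the summary draws.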
ISSN: 0096-3003, 1873-5649
DOI: 10.1016/j.amc.2005.09.094