Dynamic Cues in Vowel Classification: A Discriminant Analysis of Conversational Speech Corpus

Bibliographic Details
Published in	Korean Journal of English Language and Linguistics Vol. 25; pp. 289-310
Main Author	Hwangbo, Hyun Jin
Format	Journal Article
Language	English
Published	한국영어학회 (Korean Association for the Study of English Language and Linguistics) 01.03.2025
Subjects	영어와문학 (English and literature)

Summary:	This paper asks whether vowel inherent spectral change (VISC), the dynamic cues of vowels, is an essential feature for vowel classification in natural speech. To answer this question, vowels from the Buckeye Corpus of conversational speech were used to train and test three vowel classification models with quadratic discriminant analysis, a machine learning technique. Three models were evaluated: the steady-state (one-point) model and two trajectory models, namely the two-point and three-point models. The one-point model samples the spectral features of a vowel at one point of its duration, while the two-point and three-point models sample the features at two and three points, respectively. Various combinations of sampled points and predictors (F0, F1, F2, and F3) were analyzed, and the combinations with the best classification accuracy were compared across the models. The steady-state model achieved its highest classification accuracy when the spectral features and fundamental frequency were sampled at 50% of vowel duration, while the two-point model performed best when sampled at 30% and 70%, and the three-point model at 10%, 50%, and 90%. For all models, classification performance was highest when all parameters (F0, F1, F2, and F3) were included. Compared across the models, the trajectory models performed better than the steady-state model. In addition, including vowel duration as a parameter improved classification accuracy for specific vowels. The paper provides additional evidence for the role of VISC in vowel classification, reports detailed classification results for each vowel, identifies the misclassified vowels, and offers insights for vowel classification models.
KCI Citation Count:	0
ISSN:	1598-1398 2586-7474
DOI:	10.15738/kjell.25..202503.289
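
The sampling schemes described in the summary can be illustrated with a short sketch. This is not the author's code: the function names `sample_track` and `build_features`, the dictionary `SCHEMES`, the representation of each predictor as a list of equally spaced frames, and the use of linear interpolation between frames are all assumptions made for illustration. Only the sampling proportions (50%; 30% and 70%; 10%, 50%, and 90%) and the predictor names come from the summary.

```python
from typing import Dict, List, Optional, Sequence

# Sampling proportions of vowel duration named in the summary.
# The dictionary keys are hypothetical labels, not the paper's identifiers.
SCHEMES: Dict[str, Sequence[float]] = {
    "steady_state": (0.5,),
    "two_point": (0.3, 0.7),
    "three_point": (0.1, 0.5, 0.9),
}


def sample_track(track: Sequence[float], proportions: Sequence[float]) -> List[float]:
    """Sample a track at proportions of its duration.

    Assumes the track holds equally spaced frames spanning the whole vowel.
    Values between frames are linearly interpolated (an assumption).
    """
    if len(track) < 2:
        raise ValueError("a track needs at least two frames")
    last = len(track) - 1
    samples = []
    for p in proportions:
        if not 0.0 <= p <= 1.0:
            raise ValueError("proportions must lie in [0, 1]")
        pos = p * last                  # fractional frame index
        lo = min(int(pos), last - 1)    # left neighbour, clamped for p == 1.0
        frac = pos - lo
        samples.append(track[lo] * (1.0 - frac) + track[lo + 1] * frac)
    return samples


def build_features(
    tracks: Dict[str, Sequence[float]],
    scheme: str,
    predictors: Sequence[str] = ("F0", "F1", "F2", "F3"),
    duration_ms: Optional[float] = None,
) -> List[float]:
    """Concatenate the sampled values of each predictor into one vector.

    If duration_ms is given, it is appended as one extra feature.
    """
    features: List[float] = []
    for name in predictors:
        features.extend(sample_track(tracks[name], SCHEMES[scheme]))
    if duration_ms is not None:
        features.append(duration_ms)
    return features
```

With four predictors, the steady-state scheme yields 4 features per vowel token, the two-point scheme 8, and the three-point scheme 12 (plus one if duration is appended). Vectors of this kind could then be passed to a quadratic discriminant analysis classifier, for example scikit-learn's `QuadraticDiscriminantAnalysis`; the summary does not state which software the study used.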