Automatic speech classification to five emotional states based on gender information

Bibliographic Details
Published in	2004 12th European Signal Processing Conference, 6-10 September 2004, pp. 341-344
Main Authors	Ververidis, Dimitrios; Kotropoulos, Constantine
Format	Conference Proceeding; Journal Article
Language	English
Published	IEEE 01.09.2004
Subjects	Abstracts; Approximation; Bayesian analysis; Classification; Classifiers; Gaussian; Probability density functions; Speech; Speech recognition
ISBN	9783200001657; 3200001658
Summary:	Emotional speech recognition aims to automatically classify speech units (e.g., utterances) into emotional states, such as anger, happiness, neutral, sadness and surprise. The major contribution of this paper is to rate the discriminating capability of a set of features for emotional speech recognition when gender information is taken into consideration. A total of 87 features has been calculated over 500 utterances of the Danish Emotional Speech database. The Sequential Forward Selection method (SFS) has been used in order to discover the 5-10 features which are able to classify the samples in the best way for each gender. The criterion used in SFS is the crossvalidated correct classification rate of a Bayes classifier where the class probability distribution functions (pdfs) are approximated via Parzen windows or modeled as Gaussians. When a Bayes classifier with Gaussian pdfs is employed, a correct classification rate of 61.1% is obtained for male subjects and a corresponding rate of 57.1% for female ones. In the same experiment, a random classification would result in a correct classification rate of 20%. When gender information is not considered, a correct classification score of 50.6% is obtained.
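
The selection procedure described in the summary can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses diagonal-covariance Gaussians (the record does not say which covariance form the paper used), a 5-fold cross-validation, and synthetic toy data in place of the 87 features of the Danish Emotional Speech database. All function names (fit_gaussians, predict, cv_rate, sfs, make_toy_data) are hypothetical. To mirror the gender-dependent setup, one would presumably run sfs separately on the male and the female utterances.

```python
import math
import random


def fit_gaussians(X, y, feats):
    """Estimate prior, means and variances per class for the chosen features."""
    model = {}
    for c in set(y):
        rows = [[x[f] for f in feats] for x, lab in zip(X, y) if lab == c]
        cols = list(zip(*rows))
        n = len(rows)
        means = [sum(col) / n for col in cols]
        # Assumption: diagonal covariance; small constant avoids zero variance.
        varis = [sum((v - m) ** 2 for v in col) / n + 1e-6
                 for col, m in zip(cols, means)]
        model[c] = (n / len(y), means, varis)
    return model


def predict(model, x, feats):
    """Bayes decision: pick the class with the largest log posterior."""
    def log_post(c):
        prior, means, varis = model[c]
        ll = sum(-0.5 * math.log(2 * math.pi * v) - (x[f] - m) ** 2 / (2 * v)
                 for f, m, v in zip(feats, means, varis))
        return math.log(prior) + ll
    return max(model, key=log_post)


def cv_rate(X, y, feats, k=5):
    """Cross-validated correct classification rate for one feature subset."""
    idx = list(range(len(X)))
    correct = 0
    for fold in range(k):
        test = idx[fold::k]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        model = fit_gaussians([X[i] for i in train], [y[i] for i in train], feats)
        correct += sum(predict(model, X[i], feats) == y[i] for i in test)
    return correct / len(X)


def sfs(X, y, max_feats):
    """Sequential Forward Selection: greedily add the feature that raises the
    cross-validated rate the most; stop when no candidate improves it."""
    selected, best = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_feats:
        rate, f = max((cv_rate(X, y, selected + [f]), f) for f in remaining)
        if rate <= best:
            break
        selected.append(f)
        remaining.remove(f)
        best = rate
    return selected, best


def make_toy_data(n_per_class=40, n_feats=8, seed=0):
    """Synthetic stand-in data: 5 classes, only features 0 and 1 carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for c in range(5):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_feats)]
            row[0] += 3.0 * c        # strongly informative feature
            row[1] += 2.0 * (c % 2)  # weakly informative feature
            X.append(row)
            y.append(c)
    return X, y


X, y = make_toy_data()
selected, rate = sfs(X, y, max_feats=3)
chance = 1 / 5  # random guessing among five emotional states
print("selected features:", selected)
print("cross-validated rate:", round(rate, 3), "chance level:", chance)
```

The chance level of 1/5 corresponds to the 20% random-classification rate quoted in the summary; the rates printed here come from the toy data only and say nothing about the paper's results.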