Private emotions versus social interaction: a data-driven approach towards analysing emotion in speech

Bibliographic Details
Published in: User Modeling and User-Adapted Interaction, Vol. 18, no. 1-2, pp. 175-206
Main Authors: Batliner, Anton; Steidl, Stefan; Hacker, Christian; Nöth, Elmar
Format: Journal Article
Language: English
Published: Dordrecht: Springer Netherlands (Springer Nature B.V.), 01.02.2008
ISSN: 0924-1868, 1573-1391
DOI: 10.1007/s11257-007-9039-4

Summary: The ‘traditional’ first two dimensions in emotion research are VALENCE and AROUSAL. Normally, they are obtained by using elicited, acted data. In this paper, we use realistic, spontaneous speech data from our ‘AIBO’ corpus (human-robot communication, children interacting with Sony’s AIBO robot). The recordings were done in a Wizard-of-Oz scenario: the children believed that AIBO obeyed their commands; in fact, AIBO followed a fixed script and often disobeyed. Five labellers annotated each word as belonging to one of eleven emotion-related states; the seven of these states that occurred frequently enough are dealt with in this paper. The confusion matrices of these labels were used in a Non-Metrical Multi-dimensional Scaling to display two dimensions; the first we interpret as VALENCE, the second, however, not as AROUSAL but as INTERACTION, i.e., addressing oneself (angry, joyful) or the communication partner (motherese, reprimanding). We show that it depends on the specificity of the scenario and on the subjects’ conceptualizations whether this new dimension can be observed, and discuss impacts on the practice of labelling and processing emotional data. Two-dimensional solutions based on acoustic and linguistic features that were used for automatic classification of these emotional states are interpreted along the same lines.
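
The mapping from a label confusion matrix to a two-dimensional Non-Metrical Multi-dimensional Scaling (NMDS) solution can be illustrated with a short sketch. The following is a minimal, hypothetical Python example, not the authors' code: it uses scikit-learn's non-metric MDS on a made-up 7x7 confusion matrix, and the label set beyond the four states named in the summary is assumed for illustration only.

# Minimal sketch: NMDS on a confusion matrix over emotion-related states.
# Assumptions: scikit-learn is available; the confusion counts and the full
# seven-state label set below are hypothetical, not taken from the paper.
import numpy as np
from sklearn.manifold import MDS

states = ["angry", "emphatic", "joyful", "motherese",
          "neutral", "reprimanding", "touchy"]          # assumed label set

# Hypothetical inter-labeller confusion counts (illustrative only).
confusion = np.random.default_rng(0).integers(1, 50, size=(7, 7)).astype(float)
confusion = (confusion + confusion.T) / 2.0              # symmetrise

# Treat frequent confusion as similarity and convert to dissimilarity:
# states confused often are 'close', states confused rarely are 'distant'.
dissim = 1.0 - confusion / confusion.max()
np.fill_diagonal(dissim, 0.0)

# Non-metric MDS on the precomputed dissimilarities, two output dimensions.
nmds = MDS(n_components=2, metric=False,
           dissimilarity="precomputed", random_state=0)
coords = nmds.fit_transform(dissim)                      # one 2-D point per state

for label, (x, y) in zip(states, coords):
    print(f"{label:>12s}: ({x:+.2f}, {y:+.2f})")

On real annotation data, the two resulting axes would then be inspected and interpreted, as the summary describes, e.g. as VALENCE and INTERACTION rather than VALENCE and AROUSAL.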