Private emotions versus social interaction: a data-driven approach towards analysing emotion in speech

Bibliographic Details
Published in: User Modeling and User-Adapted Interaction, Vol. 18, no. 1-2, pp. 175-206
Main Authors: Batliner, Anton; Steidl, Stefan; Hacker, Christian; Nöth, Elmar
Format: Journal Article
Language: English
Published: Dordrecht: Springer Netherlands (Springer Nature B.V.), 01.02.2008
ISSN: 0924-1868, 1573-1391
DOI: 10.1007/s11257-007-9039-4

Summary: The ‘traditional’ first two dimensions in emotion research are VALENCE and AROUSAL. Normally, they are obtained by using elicited, acted data. In this paper, we use realistic, spontaneous speech data from our ‘AIBO’ corpus (human-robot communication, children interacting with Sony’s AIBO robot). The recordings were done in a Wizard-of-Oz scenario: the children believed that AIBO obeyed their commands; in fact, AIBO followed a fixed script and often disobeyed. Five labellers annotated each word as belonging to one of eleven emotion-related states; the seven of these states that occurred frequently enough are dealt with in this paper. The confusion matrices of these labels were used in a Non-Metrical Multi-dimensional Scaling to display two dimensions; the first we interpret as VALENCE, the second, however, not as AROUSAL but as INTERACTION, i.e., addressing oneself (angry, joyful) or the communication partner (motherese, reprimanding). We show that it depends on the specificity of the scenario and on the subjects’ conceptualizations whether this new dimension can be observed, and discuss impacts on the practice of labelling and processing emotional data. Two-dimensional solutions based on acoustic and linguistic features that were used for automatic classification of these emotional states are interpreted along the same lines.
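
The mapping from a label confusion matrix to a two-dimensional Non-Metrical Multi-dimensional Scaling (NMDS) solution can be illustrated with a short sketch. The following is a minimal, hypothetical Python example, not the authors' code: it uses scikit-learn's non-metric MDS on a made-up 7x7 confusion matrix, and the label set beyond the four states named in the summary is assumed for illustration only.

# Minimal sketch: NMDS on a confusion matrix over emotion-related states.
# Assumptions: scikit-learn is available; the confusion counts and the full
# seven-state label set below are hypothetical, not taken from the paper.
import numpy as np
from sklearn.manifold import MDS

states = ["angry", "emphatic", "joyful", "motherese",
          "neutral", "reprimanding", "touchy"]          # assumed label set

# Hypothetical inter-labeller confusion counts (illustrative only).
confusion = np.random.default_rng(0).integers(1, 50, size=(7, 7)).astype(float)
confusion = (confusion + confusion.T) / 2.0              # symmetrise

# Treat frequent confusion as similarity and convert to dissimilarity:
# states confused often are 'close', states confused rarely are 'distant'.
dissim = 1.0 - confusion / confusion.max()
np.fill_diagonal(dissim, 0.0)

# Non-metric MDS on the precomputed dissimilarities, two output dimensions.
nmds = MDS(n_components=2, metric=False,
           dissimilarity="precomputed", random_state=0)
coords = nmds.fit_transform(dissim)                      # one 2-D point per state

for label, (x, y) in zip(states, coords):
    print(f"{label:>12s}: ({x:+.2f}, {y:+.2f})")

On real annotation data, the two resulting axes would then be inspected and interpreted, as the summary describes, e.g. as VALENCE and INTERACTION rather than VALENCE and AROUSAL.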