Speech-based recognition of self-reported and observed emotion in a dimensional space

► Exploration of the use of self-reported emotion ratings for automatic affect recognition. ► Better recognition performance is obtained with observed emotion ratings than with self-reported ratings. ► Averaging emotion ratings from multiple annotators improves performance. ► Valence is better recognized with lexical than with acoustic features.

Bibliographic Details
Published in: Speech Communication, Vol. 54, No. 9, pp. 1049–1063
Main Authors: Truong, Khiet P.; van Leeuwen, David A.; de Jong, Franciska M.G.
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.11.2012
DOI: 10.1016/j.specom.2012.04.006
ISSN: 0167-6393
EISSN: 1872-7182
CODEN: SCOMDH
Copyright: 2012 Elsevier B.V.
Keywords: Affective computing; Audiovisual database; Automatic emotion recognition; Emotion annotation; Emotion database; Emotion elicitation; Emotion perception; Emotional speech; Support Vector Regression; Videogames
Subjects: Affective computing; Arousal; Audiovisual database; Automatic emotion recognition; Emotion annotation; Emotion database; Emotion elicitation; Emotion perception; Emotional speech; Emotions; Mathematical analysis; Mathematical models; Observers; Ratings; Recognition; Regression; Support Vector Regression; Videogames
Open Access: https://ris.utwente.nl/ws/files/6781115/final-s2.0-S0167639312000507-main.pdf

Abstract

The differences between self-reported and observed emotion have only been investigated marginally in the context of speech-based automatic emotion recognition. We address this issue by comparing self-reported emotion ratings with observed emotion ratings and examining how differences between the two types of ratings affect the development and performance of automatic emotion recognizers trained on them. A dimensional approach to emotion modeling is adopted: the ratings are based on continuous arousal and valence scales. We describe the TNO-Gaming Corpus, which contains spontaneous vocal and facial expressions elicited via a multiplayer videogame, annotated for emotion both by self-report and by outside observers. Comparisons show discrepancies between self-reported and observed emotion ratings, and these discrepancies are reflected in the performance of the emotion recognizers developed. Using Support Vector Regression in combination with acoustic and textual features, we develop recognizers of arousal and valence that predict points in a two-dimensional arousal-valence space. The results show that self-reported emotion is much harder to recognize than observed emotion, and that averaging ratings from multiple observers improves performance.
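The modeling recipe the abstract describes (average the continuous annotator ratings into one arousal-valence target per utterance, then train a Support Vector Regression model per dimension) can be sketched as follows. This is a minimal, hypothetical illustration, not the authors' pipeline: it uses scikit-learn's SVR, and random placeholder features stand in for the paper's acoustic and lexical descriptors.

```python
# Sketch: SVR-based prediction of points in a 2-D arousal-valence space,
# with targets obtained by averaging ratings from multiple annotators.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
n_utterances, n_features, n_annotators = 200, 40, 3

# Placeholder features (hypothetical); the paper uses acoustic and
# lexical features extracted per utterance instead.
X = rng.normal(size=(n_utterances, n_features))

# Per-annotator ratings on continuous arousal and valence scales in [-1, 1].
ratings = rng.uniform(-1, 1, size=(n_annotators, n_utterances, 2))

# Averaging over annotators yields one (arousal, valence) target per
# utterance; the paper reports that this averaging improves performance.
y = ratings.mean(axis=0)  # shape: (n_utterances, 2)

# SVR is a single-output regressor, so train one model per dimension.
models = {}
for i, dim in enumerate(["arousal", "valence"]):
    model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0, epsilon=0.1))
    model.fit(X[:150], y[:150, i])
    models[dim] = model

# Predicted point in the arousal-valence space for a held-out utterance.
point = [models[d].predict(X[150:151])[0] for d in ("arousal", "valence")]
print(f"arousal={point[0]:+.2f}, valence={point[1]:+.2f}")
```

Under this setup, swapping the target matrix between self-reported and observed ratings (or between single-observer and averaged ratings) is what allows the kind of comparison the abstract reports.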
Author Details:
– Truong, Khiet P. (k.p.truong@utwente.nl), University of Twente, Human Media Interaction, P.O. Box 217, 7500 AE Enschede, The Netherlands
– van Leeuwen, David A. (d.vanleeuwen@let.ru.nl), Radboud University Nijmegen, Centre for Language and Speech Technology, P.O. Box 9103, 6500 HD Nijmegen, The Netherlands
– de Jong, Franciska M.G. (f.m.g.dejong@utwente.nl), University of Twente, Human Media Interaction, P.O. Box 217, 7500 AE Enschede, The Netherlands