The ACII 2022 Affective Vocal Bursts Workshop & Competition

Bibliographic Details
Published in: 2022 10th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW), pp. 1-5
Main Authors: Baird, Alice; Tzirakis, Panagiotis; Brooks, Jeffrey A.; Gregory, Chris B.; Schuller, Björn; Batliner, Anton; Keltner, Dacher; Cowen, Alan
Format: Conference Proceeding
Language: English
Published: IEEE, 18.10.2022
Summary: The ACII Affective Vocal Bursts Workshop & Competition is focused on understanding multiple affective dimensions of vocal bursts: laughs, gasps, cries, screams, and many other non-linguistic vocalizations central to the expression of emotion and to human communication more generally. This year's competition comprises four tracks using a large-scale, in-the-wild dataset of 59,299 vocalizations from 1,702 speakers. The first, the A-VB-HIGH task, requires participants to perform multi-label regression on a novel emotion model, utilizing ten classes of richly annotated emotional expression intensities, including Awe, Fear, and Surprise. The second, the A-VB-TWO task, utilizes the more conventional two-dimensional model of emotion: arousal and valence. The third, the A-VB-CULTURE task, requires participants to explore the cultural aspects of the dataset, training native-country-dependent models. Finally, for the fourth task, A-VB-TYPE, participants should recognize the type of vocal burst (e.g., laughter, cry, grunt) as an 8-class classification. This paper describes the four tracks and baseline systems, which use state-of-the-art machine learning methods. The baseline performance for each track is obtained with an end-to-end deep learning model and is as follows: for A-VB-HIGH, a mean (over the 10 dimensions) Concordance Correlation Coefficient (CCC) of 0.5687; for A-VB-TWO, a mean (over the 2 dimensions) CCC of 0.5084; for A-VB-CULTURE, a mean CCC over the four cultures of 0.4401; and for A-VB-TYPE, a baseline Unweighted Average Recall (UAR) over the 8 classes of 0.4172.
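For reference, below is a minimal sketch of how the two evaluation metrics named in the summary (CCC for the regression tracks, UAR for the A-VB-TYPE classification track) are typically computed. It is not the official challenge baseline code; the array names, shapes, and random placeholder data are illustrative assumptions only.

import numpy as np
from sklearn.metrics import recall_score

def ccc(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Concordance Correlation Coefficient between two 1-D arrays."""
    mean_true, mean_pred = y_true.mean(), y_pred.mean()
    var_true, var_pred = y_true.var(), y_pred.var()
    cov = np.mean((y_true - mean_true) * (y_pred - mean_pred))
    return 2.0 * cov / (var_true + var_pred + (mean_true - mean_pred) ** 2)

# Regression tracks: report the mean CCC over the emotion dimensions
# (10 for A-VB-HIGH, 2 for A-VB-TWO), one column per dimension.
targets = np.random.rand(100, 10)        # placeholder annotations
predictions = np.random.rand(100, 10)    # placeholder model outputs
mean_ccc = np.mean([ccc(targets[:, d], predictions[:, d])
                    for d in range(targets.shape[1])])

# Classification track (A-VB-TYPE): UAR is recall averaged over the
# 8 vocal-burst classes, i.e. macro-averaged recall.
true_labels = np.random.randint(0, 8, size=100)   # placeholder labels
pred_labels = np.random.randint(0, 8, size=100)   # placeholder predictions
uar = recall_score(true_labels, pred_labels, average="macro")

print(f"mean CCC: {mean_ccc:.4f}  UAR: {uar:.4f}")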
DOI:10.1109/ACIIW57231.2022.10086002