The ACII 2022 Affective Vocal Bursts Workshop & Competition

Bibliographic Details
Published in: 2022 10th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW), pp. 1-5
Main Authors: Baird, Alice; Tzirakis, Panagiotis; Brooks, Jeffrey A.; Gregory, Chris B.; Schuller, Björn; Batliner, Anton; Keltner, Dacher; Cowen, Alan
Format: Conference Proceeding
Language: English
Published: IEEE, 18.10.2022
Summary: The ACII Affective Vocal Bursts Workshop & Competition is focused on understanding multiple affective dimensions of vocal bursts: laughs, gasps, cries, screams, and many other non-linguistic vocalizations central to the expression of emotion and to human communication more generally. This year's competition comprises four tracks using a large-scale, in-the-wild dataset of 59,299 vocalizations from 1,702 speakers. The first, the A-VB-HIGH task, requires participants to perform multi-label regression on a novel emotion model, utilizing ten classes of richly annotated emotional expression intensities, including Awe, Fear, and Surprise. The second, the A-VB-TWO task, utilizes the more conventional two-dimensional model of emotion: arousal and valence. The third, the A-VB-CULTURE task, requires participants to explore the cultural aspects of the dataset, training native-country-dependent models. Finally, for the fourth task, A-VB-TYPE, participants should recognize the type of vocal burst (e.g., laughter, cry, grunt) as an 8-class classification. This paper describes the four tracks and baseline systems, which use state-of-the-art machine learning methods. The baseline performance for each track is obtained with an end-to-end deep learning model and is as follows: for A-VB-HIGH, a mean (over the 10 dimensions) Concordance Correlation Coefficient (CCC) of 0.5687; for A-VB-TWO, a mean (over the 2 dimensions) CCC of 0.5084; for A-VB-CULTURE, a mean CCC over the four cultures of 0.4401; and for A-VB-TYPE, a baseline Unweighted Average Recall (UAR) over the 8 classes of 0.4172.
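For reference, below is a minimal sketch of how the two evaluation metrics named in the summary (CCC for the regression tracks, UAR for the A-VB-TYPE classification track) are typically computed. It is not the official challenge baseline code; the array names, shapes, and random placeholder data are illustrative assumptions only.

import numpy as np
from sklearn.metrics import recall_score

def ccc(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Concordance Correlation Coefficient between two 1-D arrays."""
    mean_true, mean_pred = y_true.mean(), y_pred.mean()
    var_true, var_pred = y_true.var(), y_pred.var()
    cov = np.mean((y_true - mean_true) * (y_pred - mean_pred))
    return 2.0 * cov / (var_true + var_pred + (mean_true - mean_pred) ** 2)

# Regression tracks: report the mean CCC over the emotion dimensions
# (10 for A-VB-HIGH, 2 for A-VB-TWO), one column per dimension.
targets = np.random.rand(100, 10)        # placeholder annotations
predictions = np.random.rand(100, 10)    # placeholder model outputs
mean_ccc = np.mean([ccc(targets[:, d], predictions[:, d])
                    for d in range(targets.shape[1])])

# Classification track (A-VB-TYPE): UAR is recall averaged over the
# 8 vocal-burst classes, i.e. macro-averaged recall.
true_labels = np.random.randint(0, 8, size=100)   # placeholder labels
pred_labels = np.random.randint(0, 8, size=100)   # placeholder predictions
uar = recall_score(true_labels, pred_labels, average="macro")

print(f"mean CCC: {mean_ccc:.4f}  UAR: {uar:.4f}")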
DOI:10.1109/ACIIW57231.2022.10086002