Audio features dedicated to the detection of arousal and valence in music recordings

Bibliographic Details
Published in: 2017 IEEE International Conference on INnovations in Intelligent SysTems and Applications (INISTA), pp. 40-44
Main Author: Grekow, Jacek
Format: Conference Proceeding
Language: English
Published: IEEE, 01.07.2017
Summary: The aim of this paper was to discover what combination of audio features gives the best performance with music emotion detection. In our approach, emotion recognition was treated as a regression problem and a two-dimensional valence-arousal model was used to measure emotions in music. We used features extracted by Essentia and Marsyas, tools for audio analysis and audio-based music information retrieval. We examined the influence of different feature sets - low-level, rhythm, tonal, and their combination - on arousal and valence prediction. The use of a combination of different types of features significantly improves the results compared with using just one group of features. We found and presented features particularly dedicated to the detection of arousal and valence separately, as well as features useful in both cases.
DOI:10.1109/INISTA.2017.8001129
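
Illustrative sketch: the comparison described in the summary (one regressor per emotion dimension, evaluated on each feature group and on their combinations) could be organized roughly as below. This is not the paper's code. The record names neither the regressor nor the evaluation metric, so a small ridge regression and the R² score are stand-ins. The group-to-column mapping, the function names, and the synthetic data are all hypothetical; real inputs would be feature vectors exported from Essentia or Marsyas.

```python
import itertools
import random

# Hypothetical mapping of feature groups to column indices (placeholders,
# not the features reported in the paper).
GROUPS = {"low-level": [0, 1], "rhythm": [2], "tonal": [3]}


def fit_ridge(X, y, lam=1e-3):
    """Solve (X^T X + lam*I) w = X^T y by Gaussian elimination.

    Ridge regression is an assumption standing in for the paper's regressor.
    """
    n = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(r[i] * t for r, t in zip(X, y)) for i in range(n)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, n))) / A[i][i]
    return w


def r2_score(y_true, y_pred):
    """Coefficient of determination (an assumed metric)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def select(X, cols):
    """Keep the chosen columns and prepend a bias term."""
    return [[1.0] + [row[c] for c in cols] for row in X]


def compare_groups(X_train, y_train, X_test, y_test, groups=GROUPS):
    """Score every non-empty combination of feature groups for one target."""
    results = {}
    for k in range(1, len(groups) + 1):
        for combo in itertools.combinations(groups, k):
            cols = [c for g in combo for c in groups[g]]
            w = fit_ridge(select(X_train, cols), y_train)
            preds = [sum(wi * xi for wi, xi in zip(w, row))
                     for row in select(X_test, cols)]
            results[combo] = r2_score(y_test, preds)
    return results


def make_synthetic(n, rng):
    """Synthetic stand-in data: the target depends on two different groups."""
    X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(n)]
    arousal = [0.8 * r[0] + 0.6 * r[2] + rng.gauss(0, 0.1) for r in X]
    return X, arousal


if __name__ == "__main__":
    rng = random.Random(0)
    X_tr, y_tr = make_synthetic(200, rng)
    X_te, y_te = make_synthetic(100, rng)
    for combo, score in compare_groups(X_tr, y_tr, X_te, y_te).items():
        print(" + ".join(combo), round(score, 3))
```

The same loop would be run once with arousal annotations and once with valence annotations as the target, which mirrors treating the two dimensions as separate regression problems.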