Prediction of the Distribution of Perceived Music Emotions Using Discrete Samples

Bibliographic Details
Published in: IEEE Transactions on Audio, Speech, and Language Processing, Vol. 19, No. 7, pp. 2184-2196
Main Authors: Yi-Hsuan Yang; H. H. Chen
Format: Journal Article
Language: English
Published: Piscataway, NJ: Institute of Electrical and Electronics Engineers (IEEE), 01.09.2011

Summary: Typically, a machine learning model for automatic music emotion recognition is trained to learn the relationship between music features and perceived emotion values. However, simply assigning a single emotion value to a clip in the training phase does not work well, because the perceived emotion of a clip varies from person to person. To resolve this problem, we propose a novel approach that represents the perceived emotion of a clip as a probability distribution in the emotion plane. In addition, we develop a methodology that predicts the emotion distribution of a clip by estimating the emotion mass at discrete samples of the emotion plane. We also develop model fusion algorithms to integrate different perceptual dimensions of music listening and to enhance the modeling of emotion perception. The effectiveness of the proposed approach is validated through an extensive performance study. An average R² statistic of 0.5439 is achieved for emotion prediction. We also show how this approach can be applied to enhance our understanding of music emotion.
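To make the discrete-sample idea concrete, the sketch below shows one plausible reading of it in Python: the valence-arousal plane is discretized into a grid of sample points, per-subject annotations of a clip are smoothed into a ground-truth emotion mass at each point, and one regressor per point maps audio features to that mass, with the outputs re-normalized into a distribution. The grid size, the Gaussian smoothing of annotations, and the choice of RandomForestRegressor are illustrative assumptions, not the authors' exact formulation.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

GRID = 8  # samples per axis of the valence-arousal plane (assumed value)

# Centers of the discrete samples covering the emotion plane [-1, 1]^2.
axis = np.linspace(-1.0, 1.0, GRID)
va, ar = np.meshgrid(axis, axis)
sample_points = np.stack([va.ravel(), ar.ravel()], axis=1)  # (GRID*GRID, 2)

def annotations_to_mass(annotations, bandwidth=0.2):
    """Smooth per-subject (valence, arousal) ratings of one clip into a
    ground-truth emotion mass at each discrete sample. The Gaussian kernel
    is an assumed stand-in for the paper's annotation aggregation."""
    d2 = ((sample_points[:, None, :] - annotations[None, :, :]) ** 2).sum(-1)
    mass = np.exp(-d2 / (2.0 * bandwidth ** 2)).sum(axis=1)
    return mass / mass.sum()  # normalize so the masses form a distribution

class DiscreteEmotionDistribution:
    """One regressor per discrete sample: audio features -> emotion mass."""
    def fit(self, X, mass):
        # X: (n_clips, n_features); mass: (n_clips, GRID*GRID)
        self.models = [RandomForestRegressor(n_estimators=50).fit(X, mass[:, k])
                       for k in range(mass.shape[1])]
        return self

    def predict(self, X):
        raw = np.stack([m.predict(X) for m in self.models], axis=1)
        raw = np.clip(raw, 1e-9, None)                # keep masses positive
        return raw / raw.sum(axis=1, keepdims=True)   # re-normalize per clip

In the same spirit, a simple late-fusion variant of the model fusion mentioned above would train one such model per feature modality (e.g., timbre, rhythm) and average the predicted distributions. The reported figure uses the standard coefficient of determination, R² = 1 − Σᵢ(yᵢ − ŷᵢ)² / Σᵢ(yᵢ − ȳ)², computed per sample point and averaged.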
ISSN: 1558-7916
EISSN: 1558-7924
DOI: 10.1109/TASL.2011.2118752