Evaluating Deep Music Generation Methods Using Data Augmentation

Bibliographic Details
Published in: 2021 IEEE 23rd International Workshop on Multimedia Signal Processing (MMSP), pp. 1-6
Main Authors: Godwin, Toby; Rizos, Georgios; Baird, Alice; Al Futaisi, Najla D.; Brisse, Vincent; Schuller, Bjorn W.
Format: Conference Proceeding
Language: English
Published: IEEE, 06.10.2021
Summary: Despite advances in deep algorithmic music generation, evaluation of generated samples often relies on human evaluation, which is subjective and costly. We focus on designing a homogeneous, objective framework for evaluating samples of algorithmically generated music. Engineered measures for evaluating generated music typically attempt to define the samples' musicality, but do not capture qualities of music such as theme or mood. We do not seek to assess the musical merit of generated music, but instead explore whether generated samples contain meaningful information pertaining to emotion or mood/theme. We achieve this by measuring the change in predictive performance of a music mood/theme classifier after augmenting its training data with generated samples. We analyse music samples generated by three models (SampleRNN, Jukebox, and DDSP) and employ a homogeneous framework across all methods to allow for objective comparison. This is the first attempt at augmenting a music genre classification dataset with conditionally generated music. We investigate the classification performance improvement using deep music generation and the ability of the generators to make emotional music by using an additional emotion annotation of the dataset. Finally, we use a classifier trained on real data to evaluate the label validity of class-conditionally generated samples.
ISSN: 2473-3628
DOI: 10.1109/MMSP53017.2021.9733502
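
The two measurements described in the summary (the change in classifier performance after augmenting the training data with generated samples, and the label validity of class-conditionally generated samples) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper evaluates audio from SampleRNN, Jukebox, and DDSP with a music mood/theme classifier, whereas the sketch below substitutes a toy nearest-centroid classifier on hand-made one-dimensional features. All function names, labels, and data values are hypothetical and chosen only for illustration.

```python
from statistics import fmean


def fit_nearest_centroid(features, labels):
    """Toy stand-in for the mood/theme classifier: one centroid per label."""
    by_label = {}
    for x, y in zip(features, labels):
        by_label.setdefault(y, []).append(x)
    # Column-wise mean of the feature vectors belonging to each label.
    return {y: [fmean(col) for col in zip(*rows)] for y, rows in by_label.items()}


def predict(model, x):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))


def accuracy(model, features, labels):
    hits = sum(predict(model, x) == y for x, y in zip(features, labels))
    return hits / len(labels)


def augmentation_gain(real_train, generated, test):
    """Test accuracy without and with generated samples in the training set."""
    (x_real, y_real), (x_gen, y_gen), (x_test, y_test) = real_train, generated, test
    baseline = accuracy(fit_nearest_centroid(x_real, y_real), x_test, y_test)
    augmented = accuracy(
        fit_nearest_centroid(x_real + x_gen, y_real + y_gen), x_test, y_test
    )
    return baseline, augmented, augmented - baseline


def label_validity(real_train, generated):
    """Share of generated samples whose conditioning label agrees with the
    prediction of a classifier trained on real data only."""
    (x_real, y_real), (x_gen, y_gen) = real_train, generated
    return accuracy(fit_nearest_centroid(x_real, y_real), x_gen, y_gen)


# Hypothetical toy data: one feature per sample, two mood labels.
real_train = ([[0.0], [0.4]], ["calm", "happy"])
generated = ([[0.4], [1.6]], ["calm", "happy"])  # labels = conditioning classes
test = ([[0.3], [1.0]], ["calm", "happy"])

baseline, augmented, gain = augmentation_gain(real_train, generated, test)
print(f"baseline={baseline:.2f} augmented={augmented:.2f} gain={gain:+.2f}")
print(f"label validity={label_validity(real_train, generated):.2f}")
```

In this framing, a positive gain suggests that the generated samples carry label-relevant information, while label validity checks whether a generator's output matches the class it was conditioned on. Running the same procedure for each generator keeps the comparison homogeneous, as the summary describes.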