Generating High-quality Symbolic Music Using Fine-grained Discriminators
Main Authors | , , , , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 03.08.2024 |
Summary: | Existing symbolic music generation methods usually utilize a discriminator to
improve the quality of generated music via global perception of music. However,
considering the complexity of information in music, such as rhythm and melody,
a single discriminator cannot fully reflect the differences in these two
primary dimensions of music. In this work, we propose to decouple the melody
and rhythm from music, and design corresponding fine-grained discriminators to
tackle the aforementioned issues. Specifically, equipped with a pitch
augmentation strategy, the melody discriminator discerns the melody variations
presented by the generated samples. By contrast, the rhythm discriminator,
enhanced with bar-level relative positional encoding, focuses on the velocity
of generated notes. Such a design allows the generator to be more explicitly
aware of which aspects should be adjusted in the generated music, making it
easier to mimic human-composed music. Experimental results on the POP909
benchmark demonstrate the favorable performance of the proposed method compared
to several state-of-the-art methods in terms of both objective and subjective
metrics. |
---|---|
DOI: | 10.48550/arxiv.2408.01696 |
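
The summary names two ingredients, a pitch augmentation strategy and a bar-level relative positional encoding, without spelling either out. The sketch below is only one plausible reading, not the authors' implementation: it assumes that pitch augmentation means transposing every note by a fixed number of semitones, and that a bar-level relative position is a note's onset measured from the start of its bar. The `Note` tuple layout, the function names `transpose` and `bar_relative_positions`, and the 4-beat bar are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

# Hypothetical note representation: (MIDI pitch, onset in beats, duration in beats).
Note = Tuple[int, float, float]


def transpose(notes: List[Note], semitones: int) -> List[Note]:
    """Shift every pitch by `semitones`, leaving onsets and durations untouched.

    Assumed form of pitch augmentation: the melodic contour (intervals between
    notes) is preserved while the absolute pitches change.
    """
    shifted = []
    for pitch, onset, duration in notes:
        new_pitch = pitch + semitones
        if not 0 <= new_pitch <= 127:
            # Refuse to leave the MIDI pitch range rather than silently clipping.
            raise ValueError(f"pitch {new_pitch} is outside the MIDI range 0-127")
        shifted.append((new_pitch, onset, duration))
    return shifted


def bar_relative_positions(
    notes: List[Note], beats_per_bar: float = 4.0
) -> List[Tuple[int, float]]:
    """Return (bar index, onset within that bar) for each note.

    Assumed form of a bar-level relative position: timing is expressed relative
    to the enclosing bar instead of the start of the piece.
    """
    return [
        (int(onset // beats_per_bar), onset % beats_per_bar)
        for _, onset, _ in notes
    ]


if __name__ == "__main__":
    phrase = [(60, 0.0, 1.0), (64, 1.0, 1.0), (67, 4.5, 0.5)]
    print(transpose(phrase, 2))
    print(bar_relative_positions(phrase))
```

Under these assumptions, transposition changes only the pitch field, so the bar-relative positions of a phrase are identical before and after augmentation. That separation is the kind of property that lets a melody-focused and a rhythm-focused discriminator look at different aspects of the same sample.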