Improving Music Genre Classification from Multi-Modal Properties of Music and Genre Correlations Perspective
Main Authors | |
---|---|
Format | Journal Article |
Language | English |
Published | 14.03.2023 |
Summary: | Music genre classification has been widely studied in the past few
years for its various applications in music information retrieval. Previous
works tend to perform unsatisfactorily, since they use audio content alone or
combine audio and lyrics content inefficiently. In addition, as genres normally
co-occur in a music track, it is desirable to capture and model genre
correlations to improve the performance of multi-label music genre
classification. To address these issues, we present a novel multi-modal method
that leverages an audio-lyrics contrastive loss and two symmetric cross-modal
attention modules to align and fuse features from audio and lyrics.
Furthermore, based on the nature of multi-label classification, a genre
correlations extraction module is presented to capture and model potential
genre correlations. Extensive experiments demonstrate that our proposed method
significantly surpasses other multi-label music genre classification methods
and achieves state-of-the-art results on the Music4All dataset. |
DOI: | 10.48550/arxiv.2303.07667 |
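The abstract names two core ingredients, an audio-lyrics contrastive loss and symmetric cross-modal attention, without specifying the architecture. As an illustration only, here is a minimal NumPy sketch of both ideas under common conventions (an InfoNCE-style symmetric contrastive loss and scaled dot-product cross-attention); all function names, shapes, and the temperature value are assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(queries, keys_values):
    """One direction of cross-modal attention: each token of one modality
    attends over the other modality's tokens (scaled dot-product).
    The paper's 'two symmetric' modules would call this in both directions."""
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)   # (n_q, n_kv)
    return softmax(scores, axis=-1) @ keys_values   # (n_q, d)

def audio_lyrics_contrastive_loss(audio, lyrics, temperature=0.1):
    """InfoNCE-style loss: matched audio/lyrics pairs (same batch index)
    are pulled together, mismatched pairs pushed apart, symmetrically."""
    a = audio / np.linalg.norm(audio, axis=1, keepdims=True)
    l = lyrics / np.linalg.norm(lyrics, axis=1, keepdims=True)
    logits = a @ l.T / temperature                  # (batch, batch)
    idx = np.arange(len(a))                         # positives on the diagonal
    loss_a2l = -np.log(softmax(logits, axis=1))[idx, idx].mean()
    loss_l2a = -np.log(softmax(logits.T, axis=1))[idx, idx].mean()
    return (loss_a2l + loss_l2a) / 2

rng = np.random.default_rng(0)
audio_emb = rng.normal(size=(4, 16))
# Well-aligned pairs (lyrics embedding close to its audio embedding)
lyrics_emb = audio_emb + 0.01 * rng.normal(size=(4, 16))
aligned_loss = audio_lyrics_contrastive_loss(audio_emb, lyrics_emb)
# Unrelated pairs should yield a higher loss
random_loss = audio_lyrics_contrastive_loss(audio_emb, rng.normal(size=(4, 16)))
fused = cross_modal_attention(audio_emb, lyrics_emb)
```

The toy check at the end shows the intended behavior: when audio and lyrics embeddings of the same track are close, the contrastive loss is small, and it grows when the pairing is random.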