Toward Universal Text-To-Music Retrieval

This paper introduces effective design choices for text-to-music retrieval systems. An ideal text-based retrieval system would support various input queries such as pre-defined tags, unseen tags, and sentence-level descriptions. In reality, most previous works mainly focused on a single query type (...

Full description

Saved in:
Bibliographic Details
Published inICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) pp. 1 - 5
Main Authors Doh, SeungHeon, Won, Minz, Choi, Keunwoo, Nam, Juhan
Format Conference Proceeding
LanguageEnglish
Published IEEE 04.06.2023
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:This paper introduces effective design choices for text-to-music retrieval systems. An ideal text-based retrieval system would support various input queries such as pre-defined tags, unseen tags, and sentence-level descriptions. In reality, most previous works mainly focused on a single query type (tag or sentence) which may not generalize to another input type. Hence, we review recent text-based music retrieval systems using our proposed benchmark in two main aspects: input text representation and training objectives. Our findings enable a universal text-to-music retrieval system that achieves comparable retrieval performances in both tag- and sentence-level inputs. Furthermore, the proposed multimodal representation generalizes to 9 different downstream music classification tasks. We present the code and demo online. 1
ISSN:2379-190X
DOI:10.1109/ICASSP49357.2023.10094670