On the Preparation and Validation of a Large-Scale Dataset of Singing Transcription

Bibliographic Details
Published in: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 276-280
Main Authors: Wang, Jun-You; Jang, Jyh-Shing Roger
Format: Conference Proceeding
Language: English
Published: IEEE 06.06.2021
Summary: This paper proposes a large-scale dataset for singing transcription, along with methods for fine-tuning and validating its contents. The dataset, named MIR-ST500, consists of more than 160,000 notes from 500 pop songs. To create it, we set labeling criteria and ask non-experts to label the notes. We also adjust the annotations to correct minor errors. Finally, to validate the dataset, we train a singing transcription model on MIR-ST500 and evaluate it on various datasets. The results show that MIR-ST500, being properly labeled and validated, can be used to build a better singing transcription model for various purposes.
ISSN: 2379-190X
DOI: 10.1109/ICASSP39728.2021.9414601