On the Preparation and Validation of a Large-Scale Dataset of Singing Transcription

Bibliographic Details
Published in	ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) pp. 276 - 280
Main Authors	Wang, Jun-You; Jang, Jyh-Shing Roger
Format	Conference Proceeding
Language	English
Published	IEEE 06.06.2021
Subjects	Acoustics; Annotations; Automatic singing transcription; Conferences; dataset preparation; dataset validation; Information retrieval; Labeling; music information retrieval; Reliability; Signal processing

Summary:	This paper proposes a large-scale dataset for singing transcription, along with methods for fine-tuning and validating its contents. The dataset, named MIR-ST500, consists of more than 160,000 notes from 500 pop songs. To create it, we set labeling criteria and ask non-experts to label the notes. We also adjust the annotations to correct minor errors. Finally, to validate the dataset, we train a singing transcription model on MIR-ST500 and evaluate it on various datasets. The results show that MIR-ST500, being properly labeled and validated, can be used to construct a better singing transcription model for various purposes.
ISSN:	2379-190X
DOI:	10.1109/ICASSP39728.2021.9414601