An effective method for audio-to-score alignment using onsets and modified constant Q spectra

Bibliographic Details
Published in	Multimedia Tools and Applications, Vol. 78, no. 2, pp. 2017-2044
Main Authors	Chen, Chunta; Jang, Jyh-Shing Roger
Format	Journal Article
Language	English
Published	New York: Springer US (Springer Nature B.V.), 2019
Subjects	Algorithms; Alignment; Computer Communication Networks; Computer Science; Data Structures and Information Theory; Dynamic programming; Feature extraction; Matching; Mathematical analysis; Matrix methods; Multimedia Information Systems; Music; Similarity; Special Purpose and Application-Based Systems; Score following; Audio-to-score alignment; Music synchronization; Audio onset detection
Summary:	This paper proposes an effective algorithm for polyphonic audio-to-score alignment that aligns a polyphonic music performance to its corresponding score. The proposed framework consists of three steps: onset detection, note matching, and dynamic programming. In the first step, onsets are detected and then onset features are extracted by applying the constant Q transform around each onset. A similarity matrix is computed using a note-matching function to evaluate the similarity between concurrent notes in the music score and onsets in the audio recording. Finally, dynamic programming is used to extract the optimal alignment path in the similarity matrix. We compared five onset detectors and three spectrum difference vectors at selected audio onsets. The experimental results revealed that our method achieved higher precision than did the other algorithms included for comparison. This paper also proposes an online approach based on onset detection that can detect most notes within only 10 ms. Based on our experimental results, this online approach outperforms all methods included for comparison when the tolerance window is 50 ms.
ISSN:	1380-7501, 1573-7721
DOI:	10.1007/s11042-018-6349-y
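
Illustration:	The last two steps named in the summary (building a similarity matrix between score events and audio onsets, then extracting an alignment path with dynamic programming) can be sketched roughly as follows. This is a minimal toy version and not the authors' implementation. The names `jaccard`, `similarity_matrix`, and `align` are hypothetical. The note-matching function here is a simple set overlap between MIDI pitches, an assumption that stands in for the constant Q features the paper uses. The zero-cost skip moves in the dynamic program are also an assumption.

```python
def jaccard(a, b):
    """Toy note-matching score (assumption): overlap of two MIDI pitch sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_matrix(score_events, onset_pitches):
    """sim[i][j] compares the concurrent notes of score event i with onset j."""
    return [[jaccard(s, o) for o in onset_pitches] for s in score_events]


def align(sim):
    """Return a monotonic list of (score_index, onset_index) matches that
    maximizes total similarity; unmatched events and onsets are skipped."""
    n = len(sim)
    m = len(sim[0]) if sim else 0
    # best[i][j]: best total using the first i score events and first j onsets
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                best[i - 1][j - 1] + sim[i - 1][j - 1],  # match event i with onset j
                best[i - 1][j],                          # skip a score event
                best[i][j - 1],                          # skip an audio onset
            )
    # Backtrack from the bottom-right corner to recover the matched pairs.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        s = sim[i - 1][j - 1]
        if s > 0 and best[i][j] == best[i - 1][j - 1] + s:
            path.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j]:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return path


# Three score events; the audio has one spurious onset (index 1).
score_events = [{60}, {64, 67}, {72}]
onset_pitches = [{60}, {61}, {64, 67}, {72}]
print(align(similarity_matrix(score_events, onset_pitches)))
# [(0, 0), (1, 2), (2, 3)]
```

In this toy input the spurious onset is left unmatched, and each score event is paired with the onset whose pitch content overlaps it most.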