Towards Robust Local Key Estimation with a Musically Inspired Neural Network
Published in | 2024 32nd European Signal Processing Conference (EUSIPCO) pp. 26 - 30 |
---|---|
Main Authors | , |
Format | Conference Proceeding |
Language | English |
Published | European Association for Signal Processing - EURASIP, 26.08.2024 |
Subjects | |
Summary: | Local key estimation from music audio recordings is a challenging task. Due to its complexity and inherent ambiguity, machine-learning methods often overfit to specific pieces and their annotations, therefore lacking robustness and generalizability. Based on a previous case study on the Schubert Winterreise dataset, this paper aims to build a robust local key estimation method. To this end, we propose a novel neural network architecture (OctaveNet), which is inspired by the musical relationship of frequency bins in the constant-Q transform (CQT) and the ability of recurrent layers to process sequential data. OctaveNet rearranges the CQT spectrogram in two different ways, processes each of the branches with convolutional and recurrent layers, and finally fuses the two feature maps to predict the local key. Our results show that, while having fewer parameters, OctaveNet achieves a substantial improvement over previous methods, especially for unseen songs, which indicates its stronger generalizability. |
---|---|
ISSN: | 2076-1465 |
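
The summary does not say which two rearrangements of the CQT spectrogram OctaveNet uses. As an illustration only, the sketch below shows one plausible pair, assuming a CQT whose bins are ordered from low to high with a fixed number of bins per octave: one view grouped by octave and one grouped by pitch class. The function name `rearrange_cqt`, the default of 12 bins per octave, and the synthetic input are hypothetical and are not taken from the paper.

```python
import numpy as np


def rearrange_cqt(cqt, bins_per_octave=12):
    """Return two octave-aware views of a CQT array of shape (n_bins, n_frames).

    Hypothetical helper: the actual rearrangements used by OctaveNet
    are not specified in the summary.
    """
    n_bins, n_frames = cqt.shape
    if n_bins % bins_per_octave != 0:
        raise ValueError("n_bins must be a multiple of bins_per_octave")
    n_octaves = n_bins // bins_per_octave

    # View A: axis 0 indexes the octave, axis 1 the position within the octave.
    by_octave = cqt.reshape(n_octaves, bins_per_octave, n_frames)

    # View B: axis 0 indexes the pitch class, axis 1 the octave, so bins that
    # are an octave apart become neighbours along axis 1.
    by_pitch_class = by_octave.transpose(1, 0, 2)
    return by_octave, by_pitch_class


# Synthetic stand-in for a CQT magnitude spectrogram: 7 octaves, 5 frames.
cqt = np.arange(7 * 12 * 5, dtype=float).reshape(84, 5)
by_octave, by_pitch_class = rearrange_cqt(cqt)
print(by_octave.shape, by_pitch_class.shape)
```

In both views, frequency bins related by an octave share an index along one axis, which is the kind of musical relationship the summary refers to. Each view could then feed its own convolutional and recurrent branch before the feature maps are fused.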