Polyphonic Piano Music Transcription using Long Short-Term Memory

Bibliographic Details
Published in 2019 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT) pp. 1-7
Main Authors Sadekar, Aakash; Mahajan, Shrinivas P
Format Conference Proceeding
Language English
Published IEEE 01.07.2019
Summary: Music transcription is considered a challenging task even for a human expert. It is an important part of music information retrieval. Earlier automatic music transcription systems relied on spectrogram decomposition techniques such as non-negative matrix factorisation. With the evolution of high-speed computing and the availability of large datasets, the current state of the art uses deep-learning techniques such as convolutional neural networks (CNNs) or recurrent neural networks (RNNs) for music transcription. RNNs are well suited to temporal data with long-term dependencies. However, they suffer from the vanishing and exploding gradient problems. While automatic music transcription systems exist for monophonic music, we endeavour to develop a polyphonic transcription model. In this paper, a novel model is proposed based on long short-term memory (LSTM), a variant of the RNN without the gradient problem. An LSTM selectively remembers past inputs and passes on the relevant information. The model is designed for piano music and gives one of the 88 keys as output. One-hot encoding is used to map the output to the pitches. It is observed that the short-time Fourier transform requires a lot of memory, so the constant-Q transform is used instead. Mel-scaled spectrograms give better results than linearly scaled spectrograms. The model is trained on the MAESTRO database and tested on the standard MAPS database.
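The one-hot mapping between the 88 piano keys and pitches mentioned in the summary can be sketched as follows. This is only an illustration, not the authors' implementation: it assumes the keys are indexed from A0 to C8 and identified by MIDI note numbers 21 to 108, with A4 tuned to 440 Hz. All function and constant names are hypothetical.

```python
# Illustrative sketch; the MIDI-based key indexing is an assumption,
# not something stated in the record.
NUM_KEYS = 88
LOWEST_MIDI = 21  # A0, the lowest key on a standard 88-key piano


def key_to_one_hot(midi_pitch):
    """Return an 88-element one-hot list for a MIDI pitch in 21..108."""
    index = midi_pitch - LOWEST_MIDI
    if not 0 <= index < NUM_KEYS:
        raise ValueError("pitch outside the 88-key piano range")
    vec = [0] * NUM_KEYS
    vec[index] = 1
    return vec


def one_hot_to_key(vec):
    """Recover the MIDI pitch from a one-hot vector."""
    return vec.index(1) + LOWEST_MIDI


def midi_to_hz(midi_pitch):
    """Equal-tempered frequency, assuming A4 (MIDI 69) is 440 Hz."""
    return 440.0 * 2.0 ** ((midi_pitch - 69) / 12)
```

For example, middle C (MIDI 60) sets position 39 of the vector, and the lowest key, A0, corresponds to 27.5 Hz under this tuning assumption.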
DOI:10.1109/ICCCNT45670.2019.8944400