Informed monaural source separation of music based on convolutional sparse coding

Bibliographic Details
Published in	2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 236-240
Main Authors	Ping-Keng Jao, Yi-Hsuan Yang, Brendt Wohlberg
Format	Conference Proceeding
Language	English
Published	IEEE 01.04.2015
Subjects	Convolution; Convolutional codes; Convolutional sparse coding; Dictionaries; dictionary learning; Instruments; score-informed monaural source separation; Source separation; Speech; Time-domain analysis
Summary:	Monaural source separation is a challenging problem that has many important applications in music information retrieval. In this paper, we focus on the score-informed variant of this problem. While non-negative matrix factorization and some other approaches have been shown to be effective, few existing approaches have properly taken the phase information into account. The separation result can therefore sound unnatural, because the phase of each source signal is assumed to be equal to the phase of the mixed signal. To remedy this, we propose to perform source separation directly in the time domain using a convolutional sparse coding (CSC) approach. Evaluation on the Bach10 dataset shows that, when the instrument, pitch and onset/offset times are given, the source-to-distortion ratio (SDR) of the separation result reaches 8.59 dB, which is 2.02 dB higher than that of a state-of-the-art system called Soundprism.
ISSN:	1520-6149 2379-190X
DOI:	10.1109/ICASSP.2015.7177967
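
Illustration:	The summary describes separating sources directly in the time domain with convolutional sparse coding. The sketch below is not the authors' implementation; it only illustrates the general CSC signal model, in which a mixture is approximated by a sum of dictionary atoms convolved with sparse activation maps, and each source is recovered by summing only the atoms assigned to it. The function names (`reconstruct_sources`, `energy_ratio_db`), the toy atoms and the atom-to-source grouping are all hypothetical, and the activations are assumed to be already known rather than obtained from a sparse coding solver. The energy ratio shown is a simplified signal-to-error measure and is not necessarily the SDR definition used in the paper.

```python
import numpy as np


def reconstruct_sources(atoms, activations, groups):
    """Rebuild one time-domain signal per source.

    atoms       : list of 1-D arrays (hypothetical dictionary filters)
    activations : 2-D array, shape (num_atoms, num_samples), mostly zeros
    groups      : list of index lists; groups[j] holds the atoms of source j
    """
    num_samples = activations.shape[1]
    sources = []
    for atom_indices in groups:
        signal = np.zeros(num_samples)
        for k in atom_indices:
            # Full convolution, truncated to the length of the mixture.
            signal += np.convolve(activations[k], atoms[k])[:num_samples]
        sources.append(signal)
    return np.stack(sources)


def energy_ratio_db(reference, estimate):
    """Simplified signal-to-error ratio in dB (an assumption, not BSS Eval)."""
    error = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(error ** 2))


# Toy data: two atoms, each assigned to a different source.
atoms = [np.array([1.0, 0.5]), np.array([1.0, -1.0])]
activations = np.zeros((2, 6))
activations[0, 1] = 2.0  # atom 0 fires at sample 1 with gain 2
activations[1, 3] = 1.0  # atom 1 fires at sample 3 with gain 1
groups = [[0], [1]]

sources = reconstruct_sources(atoms, activations, groups)
mixture = sources.sum(axis=0)  # the model's approximation of the mixture
```

Because each source is synthesized as a waveform, its phase comes from the atoms themselves and is not borrowed from the mixture, which is the motivation the summary gives for working in the time domain.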