Sparse feature learning for instrument identification: Effects of sampling and pooling methods

Feature learning for music applications has recently received considerable attention from many researchers. This paper reports on the sparse feature learning algorithm for musical instrument identification, and in particular, focuses on the effects of the frame sampling techniques for dictionary lea...

Full description

Saved in:
Bibliographic Details
Published inThe Journal of the Acoustical Society of America Vol. 139; no. 5; p. 2290
Main Authors Han, Yoonchang, Lee, Subin, Nam, Juhan, Lee, Kyogu
Format Journal Article
LanguageEnglish
Published United States 01.05.2016
Online AccessGet more information

Cover

Loading…
More Information
Summary:Feature learning for music applications has recently received considerable attention from many researchers. This paper reports on the sparse feature learning algorithm for musical instrument identification, and in particular, focuses on the effects of the frame sampling techniques for dictionary learning and the pooling methods for feature aggregation. To this end, two frame sampling techniques are examined that are fixed and proportional random sampling. Furthermore, the effect of using onset frame was analyzed for both of proposed sampling methods. Regarding summarization of the feature activation, a standard deviation pooling method is used and compared with the commonly used max- and average-pooling techniques. Using more than 47 000 recordings of 24 instruments from various performers, playing styles, and dynamics, a number of tuning parameters are experimented including the analysis frame size, the dictionary size, and the type of frequency scaling as well as the different sampling and pooling methods. The results show that the combination of proportional sampling and standard deviation pooling achieve the best overall performance of 95.62% while the optimal parameter set varies among the instrument classes.
ISSN:1520-8524
DOI:10.1121/1.4946988