Analyzing the dictionary properties and sparsity constraints for a dictionary-based music genre classification system

Bibliographic Details
Published in 2013 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, pp. 1-8
Main Authors Ping-Keng Jao, Li Su, Yi-Hsuan Yang
Format Conference Proceeding
Language English
Published APSIPA, 01.10.2013

Abstract Learning dictionaries from a large-scale music database is a burgeoning research topic in the music information retrieval (MIR) community. Classification systems based on such learned features have been shown to reach state-of-the-art accuracy on many music classification benchmarks. Although the general approach of dictionary-based MIR has been shown to be effective, little work has investigated the relationship between system performance and dictionary properties such as sparsity, coherence, and condition number. This paper addresses the issue by systematically evaluating three dictionary learning algorithms on the task of genre classification: the least-squares-based RLS (recursive least squares) algorithm and two variants of the stochastic-gradient-descent-based ODL (online dictionary learning) algorithm with different regularization functions. Specifically, we learn the dictionary from the USPOP2002 dataset and perform genre classification on the GTZAN dataset. Our results show that setting strict sparsity constraints in RLS-based dictionary learning (i.e., <1% of the signal dimension) leads to better genre classification accuracy (around 80% when a linear-kernel support vector classifier is adopted). Moreover, we find that the dictionary learning phase and the encoding phase need different sparsity constraints. Important links between dictionary properties and classification accuracy are also identified, such as a strong correlation between reconstruction error and classification accuracy for all algorithms. These findings help in designing future dictionary-based MIR systems and in selecting important dictionary learning parameters.
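The dictionary properties named in the abstract (coherence, condition number, sparsity of the code, reconstruction error) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the dictionary here is random rather than learned with RLS or ODL, the encoder is a plain orthogonal matching pursuit chosen only for illustration, and the function names `normalize_columns`, `mutual_coherence`, and `omp` are hypothetical.

```python
import numpy as np

def normalize_columns(D):
    """Scale every atom (column) of D to unit Euclidean norm."""
    return D / np.linalg.norm(D, axis=0, keepdims=True)

def mutual_coherence(D):
    """Largest absolute inner product between two distinct unit-norm atoms."""
    Dn = normalize_columns(D)
    G = np.abs(Dn.T @ Dn)          # Gram matrix of the normalized atoms
    np.fill_diagonal(G, 0.0)       # ignore each atom's product with itself
    return G.max()

def omp(D, x, k):
    """Greedy orthogonal matching pursuit: encode x with at most k atoms of D."""
    residual = x.copy()
    support = []
    coef = np.zeros(D.shape[1])
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:
            break
        support.append(j)
        # Refit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    if support:
        coef[support] = sol
    return coef

# Toy setting (an assumption, not the paper's setup): 64-dim signals, 128 atoms.
rng = np.random.default_rng(0)
D = normalize_columns(rng.standard_normal((64, 128)))
true_code = np.zeros(128)
true_code[[3, 40, 99]] = [1.0, -0.5, 2.0]
x = D @ true_code

code = omp(D, x, k=3)                       # sparsity constraint in the encoding phase
error = np.linalg.norm(x - D @ code)        # reconstruction error
print("coherence:", mutual_coherence(D))
print("condition number:", np.linalg.cond(D))
print("nonzeros:", np.count_nonzero(code), "reconstruction error:", error)
```

Changing `k` shows how the sparsity constraint used for encoding trades off against reconstruction error, which is the kind of relationship the abstract reports studying.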
Author Ping-Keng Jao (nafraw@citi.sinica.edu.tw), Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan
Author Li Su (lisu@citi.sinica.edu.tw), Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan
Author Yi-Hsuan Yang (yang@citi.sinica.edu.tw), Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan
DOI 10.1109/APSIPA.2013.6694278
EISBN 9789869000604, 9869000606
Subjects Accuracy, Algorithm design and analysis, Classification algorithms, Dictionaries, Encoding, Kernel, Support vector machines
URI https://ieeexplore.ieee.org/document/6694278