Analyzing the dictionary properties and sparsity constraints for a dictionary-based music genre classification system
Published in | 2013 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference pp. 1 - 8 |
---|---|
Main Authors | Ping-Keng Jao, Li Su, Yi-Hsuan Yang |
Format | Conference Proceeding |
Language | English |
Published | APSIPA, 01.10.2013 |
Subjects | |
Abstract | Learning dictionaries from a large-scale music database is a burgeoning research topic in the music information retrieval (MIR) community. Classification systems based on such learned features have been shown to achieve state-of-the-art accuracy on many music classification benchmarks. Although the general approach of dictionary-based MIR has been shown to be effective, little work has investigated the relationship between system performance and dictionary properties such as sparsity, coherence, and condition number. This paper addresses this issue by systematically evaluating the performance of three dictionary learning algorithms on the task of genre classification: the least-squares-based RLS (recursive least squares) algorithm, and two variants of the stochastic gradient descent-based ODL (online dictionary learning) algorithm with different regularization functions. Specifically, we learn the dictionary with the USPOP2002 dataset and perform genre classification with the GTZAN dataset. Our results show that setting strict sparsity constraints in RLS-based dictionary learning (i.e., <1% of the signal dimension) leads to better accuracy in genre classification (around 80% when a linear-kernel support vector classifier is adopted). Moreover, we find that different sparsity constraints are needed for the dictionary learning phase and the encoding phase. Important links between dictionary properties and classification accuracy are also identified, such as a strong correlation between reconstruction error and classification accuracy in all algorithms. These findings help the design of future dictionary-based MIR systems and the selection of important dictionary learning parameters. |
---|---|
Author | Yi-Hsuan Yang; Li Su; Ping-Keng Jao |
Author_xml | 1. Ping-Keng Jao, nafraw@citi.sinica.edu.tw, Res. Center for Inf. Technol. Innovation, Acad. Sinica, Taipei, Taiwan; 2. Li Su, lisu@citi.sinica.edu.tw, Res. Center for Inf. Technol. Innovation, Acad. Sinica, Taipei, Taiwan; 3. Yi-Hsuan Yang, yang@citi.sinica.edu.tw, Res. Center for Inf. Technol. Innovation, Acad. Sinica, Taipei, Taiwan |
ContentType | Conference Proceeding |
DOI | 10.1109/APSIPA.2013.6694278 |
EISBN | 9789869000604 9869000606 |
EndPage | 8 |
ExternalDocumentID | 6694278 |
Genre | orig-research |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
PageCount | 8 |
PublicationCentury | 2000 |
PublicationDate | 2013-Oct. |
PublicationDateYYYYMMDD | 2013-10-01 |
PublicationDate_xml | – month: 10 year: 2013 text: 2013-Oct. |
PublicationDecade | 2010 |
PublicationTitle | 2013 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference |
PublicationTitleAbbrev | APSIPA |
PublicationYear | 2013 |
Publisher | APSIPA |
Publisher_xml | – name: APSIPA |
SourceID | ieee |
SourceType | Publisher |
StartPage | 1 |
SubjectTerms | Accuracy; Algorithm design and analysis; Classification algorithms; Dictionaries; Encoding; Kernel; Support vector machines |
Title | Analyzing the dictionary properties and sparsity constraints for a dictionary-based music genre classification system |
URI | https://ieeexplore.ieee.org/document/6694278 |
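The abstract names several dictionary properties (sparsity, coherence, condition number) and reconstruction error without defining them. The sketch below shows one common way to compute each with NumPy. It is not the authors' code: the function names, the random dictionary, and the reading of "sparsity" as nonzero coefficients per signal divided by the signal dimension are all assumptions made for illustration.

```python
import numpy as np


def mutual_coherence(D):
    """Largest absolute inner product between two distinct unit-norm atoms.

    D has shape (signal_dim, n_atoms); each column is one atom.
    """
    Dn = D / np.linalg.norm(D, axis=0, keepdims=True)
    G = np.abs(Dn.T @ Dn)      # absolute Gram matrix of the normalized atoms
    np.fill_diagonal(G, 0.0)   # ignore each atom's inner product with itself
    return float(G.max())


def condition_number(D):
    """Ratio of the largest to the smallest singular value of D."""
    s = np.linalg.svd(D, compute_uv=False)  # sorted in descending order
    return float(s[0] / s[-1])


def sparsity_vs_dimension(codes, signal_dim, tol=1e-12):
    """Mean nonzero count per code vector, relative to the signal dimension.

    codes has shape (n_atoms, n_signals). This ratio is an assumed reading
    of the abstract's "<1% of the signal dimension".
    """
    nonzeros_per_signal = np.sum(np.abs(codes) > tol, axis=0)
    return float(nonzeros_per_signal.mean() / signal_dim)


def relative_reconstruction_error(X, D, codes):
    """Frobenius norm of the residual X - D @ codes, relative to that of X."""
    return float(np.linalg.norm(X - D @ codes) / np.linalg.norm(X))


# Hypothetical toy data: 16 random atoms for 8-dimensional signals.
rng = np.random.default_rng(0)
D = rng.standard_normal((8, 16))
codes = np.zeros((16, 5))
codes[[0, 3, 7, 9, 12], np.arange(5)] = 1.0  # one active atom per signal
X = D @ codes + 0.01 * rng.standard_normal((8, 5))

print("coherence:", mutual_coherence(D))
print("condition number:", condition_number(D))
print("sparsity ratio:", sparsity_vs_dimension(codes, signal_dim=8))
print("relative error:", relative_reconstruction_error(X, D, codes))
```

With learned dictionaries in place of the random one, these quantities could be tabulated against classification accuracy to look for the kind of correlations the abstract reports.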