Towards Robust Local Key Estimation with a Musically Inspired Neural Network
Published in | 2024 32nd European Signal Processing Conference (EUSIPCO) pp. 26 - 30 |
---|---|
Main Authors | Ding, Yiwei; Weiß, Christof |
Format | Conference Proceeding |
Language | English |
Published | European Association for Signal Processing - EURASIP, 26.08.2024 |
Subjects | |
Abstract | Local key estimation from music audio recordings is a challenging task. Due to its complexity and inherent ambiguity, machine-learning methods often overfit to specific pieces and their annotations, and therefore lack robustness and generalizability. Based on a previous case study on the Schubert Winterreise dataset, this paper aims to build a robust local key estimation method. To this end, we propose a novel neural network architecture (OctaveNet), which is inspired by the musical relationship of frequency bins in the constant-Q transform (CQT) and the ability of recurrent layers to process sequential data. OctaveNet rearranges the CQT spectrogram in two different ways, processes each of the branches with convolutional and recurrent layers, and finally fuses the two feature maps to predict the local key. Our results show that, while having fewer parameters, OctaveNet achieves a substantial improvement over previous methods, especially for unseen songs, which indicates its stronger generalizability. |
---|---|
Author | Weiß, Christof; Ding, Yiwei |
Author_xml | – sequence: 1 givenname: Yiwei surname: Ding fullname: Ding, Yiwei organization: Georgia Institute of Technology,Music Informatics Group,Atlanta,GA,USA – sequence: 2 givenname: Christof surname: Weiß fullname: Weiß, Christof organization: Georgia Institute of Technology,Music Informatics Group,Atlanta,GA,USA |
ContentType | Conference Proceeding |
DBID | 6IE 6IL CBEJK RIE RIL |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEL; IEEE Proceedings Order Plans (POP All) 1998-Present |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: RIE name: IEL url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Music |
EISBN | 946459361X 9789464593617 |
EISSN | 2076-1465 |
EndPage | 30 |
ExternalDocumentID | 10715249 |
Genre | orig-research |
GroupedDBID | 6IE 6IL ALMA_UNASSIGNED_HOLDINGS CBEJK RIE RIL |
ID | FETCH-ieee_primary_107152493 |
IEDL.DBID | RIE |
IngestDate | Wed Oct 30 05:55:21 EDT 2024 |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-ieee_primary_107152493 |
ParticipantIDs | ieee_primary_10715249 |
PublicationCentury | 2000 |
PublicationDate | 2024-Aug.-26 |
PublicationDateYYYYMMDD | 2024-08-26 |
PublicationDate_xml | – month: 08 year: 2024 text: 2024-Aug.-26 day: 26 |
PublicationDecade | 2020 |
PublicationTitle | 2024 32nd European Signal Processing Conference (EUSIPCO) |
PublicationTitleAbbrev | EUSIPCO |
PublicationYear | 2024 |
Publisher | European Association for Signal Processing - EURASIP |
Publisher_xml | – name: European Association for Signal Processing - EURASIP |
SSID | ssib028087106 ssib025355106 ssib023431665 |
SourceID | ieee |
SourceType | Publisher |
StartPage | 26 |
SubjectTerms | Convolution; deep learning; Estimation; Fuses; local key estimation; Machine learning; Multiple signal classification; Music; music information retrieval; Neural networks; Robustness; Spectrogram; tonal analysis; Transforms |
Title | Towards Robust Local Key Estimation with a Musically Inspired Neural Network |
URI | https://ieeexplore.ieee.org/document/10715249 |
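
The abstract states that OctaveNet rearranges the CQT spectrogram in two different ways before its two branches, but this record does not say which two. The sketch below shows one plausible reading only: grouping CQT bins by octave, and grouping them by pitch class. The function name `rearrange_cqt`, the 12-bins-per-octave default, and both layouts are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def rearrange_cqt(cqt, bins_per_octave=12):
    """Return two hypothetical views of a CQT magnitude array.

    cqt: array of shape (n_bins, n_frames), bins ordered low to high,
         with n_bins a multiple of bins_per_octave (an assumption).
    """
    n_bins, n_frames = cqt.shape
    if n_bins % bins_per_octave != 0:
        raise ValueError("n_bins must be a multiple of bins_per_octave")
    n_octaves = n_bins // bins_per_octave

    # View 1: (octave, pitch class, frame); adjacent bins stay adjacent.
    by_octave = cqt.reshape(n_octaves, bins_per_octave, n_frames)

    # View 2: (pitch class, octave, frame); bins an octave apart become
    # neighbours along the second axis.
    by_pitch_class = by_octave.transpose(1, 0, 2)
    return by_octave, by_pitch_class


# Toy input: 2 octaves x 12 bins, 3 frames.
cqt = np.arange(24 * 3, dtype=float).reshape(24, 3)
by_octave, by_pitch_class = rearrange_cqt(cqt)
print(by_octave.shape, by_pitch_class.shape)
```

In a design like this, each view would feed its own convolutional and recurrent branch, with the two feature maps fused before the key classifier, as the abstract outlines; the layer details are not given in this record.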