Towards Robust Local Key Estimation with a Musically Inspired Neural Network

Bibliographic Details
Published in 2024 32nd European Signal Processing Conference (EUSIPCO) pp. 26 - 30
Main Authors Ding, Yiwei; Weiß, Christof
Format Conference Proceeding
Language English
Published European Association for Signal Processing - EURASIP 26.08.2024
Abstract Local key estimation from music audio recordings is a challenging task. Due to its complexity and inherent ambiguity, machine-learning methods often overfit to specific pieces and their annotations, and therefore lack robustness and generalizability. Based on a previous case study on the Schubert Winterreise dataset, this paper aims to build a robust local key estimation method. To this end, we propose a novel neural network architecture (OctaveNet), which is inspired by the musical relationship of frequency bins in the constant-Q transform (CQT) and the ability of recurrent layers to process sequential data. OctaveNet rearranges the CQT spectrogram in two different ways, processes each of the branches with convolutional and recurrent layers, and finally fuses the two feature maps to predict the local key. Our results show that, while having fewer parameters, OctaveNet achieves a substantial improvement over previous methods, especially for unseen songs, which indicates its stronger generalizability.
Author Weiß, Christof
Ding, Yiwei
Author_xml – sequence: 1
  givenname: Yiwei
  surname: Ding
  fullname: Ding, Yiwei
  organization: Georgia Institute of Technology,Music Informatics Group,Atlanta,GA,USA
– sequence: 2
  givenname: Christof
  surname: Weiß
  fullname: Weiß, Christof
  organization: Georgia Institute of Technology,Music Informatics Group,Atlanta,GA,USA
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEL
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEL
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Music
EISBN 946459361X
9789464593617
EISSN 2076-1465
EndPage 30
ExternalDocumentID 10715249
Genre orig-research
GroupedDBID 6IE
6IL
ALMA_UNASSIGNED_HOLDINGS
CBEJK
RIE
RIL
ID FETCH-ieee_primary_107152493
IEDL.DBID RIE
IngestDate Wed Oct 30 05:55:21 EDT 2024
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
MergedId FETCHMERGED-ieee_primary_107152493
ParticipantIDs ieee_primary_10715249
PublicationCentury 2000
PublicationDate 2024-Aug.-26
PublicationDateYYYYMMDD 2024-08-26
PublicationDate_xml – month: 08
  year: 2024
  text: 2024-Aug.-26
  day: 26
PublicationDecade 2020
PublicationTitle 2024 32nd European Signal Processing Conference (EUSIPCO)
PublicationTitleAbbrev EUSIPCO
PublicationYear 2024
Publisher European Association for Signal Processing - EURASIP
Publisher_xml – name: European Association for Signal Processing - EURASIP
SSID ssib028087106
ssib025355106
ssib023431665
Score 3.866993
SourceID ieee
SourceType Publisher
StartPage 26
SubjectTerms Convolution
deep learning
Estimation
Fuses
local key estimation
Machine learning
Multiple signal classification
Music
music information retrieval
Neural networks
Robustness
Spectrogram
tonal analysis
Transforms
Title Towards Robust Local Key Estimation with a Musically Inspired Neural Network
URI https://ieeexplore.ieee.org/document/10715249
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider IEEE