Toward a More Complete OMR Solution

Bibliographic Details
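Main Authors: Yang, Guang; Zhang, Muru; Qiu, Lin; Wan, Yanming; Smith, Noah A
Format: Journal Article
Language: English
Published: 30.08.2024
Subjects: Computer Science - Artificial Intelligence; Computer Science - Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2409.00316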

Abstract: Optical music recognition (OMR) aims to convert music notation into digital formats. One approach to tackle OMR is through a multi-stage pipeline, where the system first detects visual music notation elements in the image (object detection) and then assembles them into a music notation (notation assembly). Most previous work on notation assembly unrealistically assumes perfect object detection. In this study, we focus on the MUSCIMA++ v2.0 dataset, which represents musical notation as a graph with pairwise relationships among detected music objects, and we consider both stages together. First, we introduce a music object detector based on YOLOv8, which improves detection performance. Second, we introduce a supervised training pipeline that completes the notation assembly stage based on detection output. We find that this model is able to outperform existing models trained on perfect detection output, showing the benefit of considering the detection and assembly stages in a more holistic way. These findings, together with our novel evaluation metric, are important steps toward a more complete OMR solution.
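
The two-stage idea in the abstract (detect objects, then link them pairwise into a notation graph) can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the class labels, the `max_gap` threshold, and the rule-based `assemble` function are hypothetical stand-ins, not the authors' YOLOv8 detector, their learned assembly model, or their evaluation metric.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class DetectedObject:
    """One detected symbol: a class label and a box (top, left, bottom, right)."""
    obj_id: int
    label: str  # hypothetical label names, not the dataset's class list
    box: tuple[int, int, int, int]


def box_gap(a: DetectedObject, b: DetectedObject) -> int:
    """Largest axis-aligned gap between two boxes; 0 if they touch or overlap."""
    dy = max(a.box[0] - b.box[2], b.box[0] - a.box[2], 0)
    dx = max(a.box[1] - b.box[3], b.box[1] - a.box[3], 0)
    return max(dx, dy)


def assemble(objects, max_gap=5, allowed=(("notehead", "stem"),)):
    """Toy assembly stage: link ordered pairs whose labels are allowed and
    whose boxes are close. A learned pairwise classifier would go here."""
    edges = set()
    for a, b in permutations(objects, 2):
        if (a.label, b.label) in allowed and box_gap(a, b) <= max_gap:
            edges.add((a.obj_id, b.obj_id))
    return edges


def edge_f1(predicted: set, gold: set) -> float:
    """F1 over directed edges, a generic score (not the paper's metric)."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Stand-in for detector output: two noteheads and one stem.
detections = [
    DetectedObject(0, "notehead", (100, 50, 110, 62)),
    DetectedObject(1, "stem", (60, 61, 105, 63)),
    DetectedObject(2, "notehead", (100, 200, 110, 212)),
]
predicted_edges = assemble(detections)
gold_edges = {(0, 1), (2, 1)}  # made-up ground truth for the example
score = edge_f1(predicted_edges, gold_edges)
```

Because the assembly step consumes whatever the detector produced, any missed or misplaced box changes the candidate pairs, which is why evaluating assembly only on perfect detections can be misleading.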
Author_xml – sequence: 1
  givenname: Guang
  surname: Yang
  fullname: Yang, Guang
  organization: Paul G. Allen School of Computer Science & Engineering, University of Washington, United States
– sequence: 2
  givenname: Muru
  surname: Zhang
  fullname: Zhang, Muru
  organization: Paul G. Allen School of Computer Science & Engineering, University of Washington, United States
– sequence: 3
  givenname: Lin
  surname: Qiu
  fullname: Qiu, Lin
  organization: Paul G. Allen School of Computer Science & Engineering, University of Washington, United States
– sequence: 4
  givenname: Yanming
  surname: Wan
  fullname: Wan, Yanming
  organization: Paul G. Allen School of Computer Science & Engineering, University of Washington, United States
– sequence: 5
  givenname: Noah A
  surname: Smith
  fullname: Smith, Noah A
  organization: Allen Institute for Artificial Intelligence, United States
BackLink https://doi.org/10.48550/arXiv.2409.00316 (View paper in arXiv)
ContentType Journal Article
Copyright http://creativecommons.org/licenses/by/4.0
DOI 10.48550/arxiv.2409.00316
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed false
IsScholarly false
Language English
OpenAccessLink https://arxiv.org/abs/2409.00316
PublicationDate 2024-08-30
SecondaryResourceType preprint
SourceID arxiv
SourceType Open Access Repository
SubjectTerms Computer Science - Artificial Intelligence
Computer Science - Computer Vision and Pattern Recognition
Title Toward a More Complete OMR Solution
URI https://arxiv.org/abs/2409.00316