Toward a More Complete OMR Solution
Main Authors | Yang, Guang; Zhang, Muru; Qiu, Lin; Wan, Yanming; Smith, Noah A |
---|---|
Format | Journal Article |
Language | English |
Published | 30.08.2024 |
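Subjects | Computer Science - Artificial Intelligence; Computer Science - Computer Vision and Pattern Recognition |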
Abstract | Optical music recognition (OMR) aims to convert music notation into digital
formats. One approach to tackling OMR is a multi-stage pipeline, in which
the system first detects visual music notation elements in the image (object
detection) and then assembles them into music notation (notation assembly).
Most previous work on notation assembly unrealistically assumes perfect object
detection. In this study, we focus on the MUSCIMA++ v2.0 dataset, which
represents musical notation as a graph with pairwise relationships among
detected music objects, and we consider both stages together. First, we
introduce a music object detector based on YOLOv8, which improves detection
performance. Second, we introduce a supervised training pipeline that completes
the notation assembly stage based on detection output. We find that this model
is able to outperform existing models trained on perfect detection output,
showing the benefit of considering the detection and assembly stages in a more
holistic way. These findings, together with our novel evaluation metric, are
important steps toward a more complete OMR solution. |
---|---|
Author | Qiu, Lin; Zhang, Muru; Wan, Yanming; Yang, Guang; Smith, Noah A |
Author_xml | – sequence: 1 givenname: Guang surname: Yang fullname: Yang, Guang organization: Paul G. Allen School of Computer Science & Engineering, University of Washington, United States – sequence: 2 givenname: Muru surname: Zhang fullname: Zhang, Muru organization: Paul G. Allen School of Computer Science & Engineering, University of Washington, United States – sequence: 3 givenname: Lin surname: Qiu fullname: Qiu, Lin organization: Paul G. Allen School of Computer Science & Engineering, University of Washington, United States – sequence: 4 givenname: Yanming surname: Wan fullname: Wan, Yanming organization: Paul G. Allen School of Computer Science & Engineering, University of Washington, United States – sequence: 5 givenname: Noah A surname: Smith fullname: Smith, Noah A organization: Allen Institute for Artificial Intelligence, United States |
BackLink | https://doi.org/10.48550/arXiv.2409.00316$$DView paper in arXiv |
ContentType | Journal Article |
Copyright | http://creativecommons.org/licenses/by/4.0 |
Copyright_xml | – notice: http://creativecommons.org/licenses/by/4.0 |
DBID | AKY GOX |
DOI | 10.48550/arxiv.2409.00316 |
DatabaseName | arXiv Computer Science arXiv.org |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: GOX name: arXiv.org url: http://arxiv.org/find sourceTypes: Open Access Repository |
DeliveryMethod | fulltext_linktorsrc |
ExternalDocumentID | 2409_00316 |
GroupedDBID | AKY GOX |
ID | FETCH-arxiv_primary_2409_003163 |
IEDL.DBID | GOX |
IngestDate | Thu Sep 05 12:20:22 EDT 2024 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-arxiv_primary_2409_003163 |
OpenAccessLink | https://arxiv.org/abs/2409.00316 |
ParticipantIDs | arxiv_primary_2409_00316 |
PublicationCentury | 2000 |
PublicationDate | 2024-08-30 |
PublicationDateYYYYMMDD | 2024-08-30 |
PublicationDate_xml | – month: 08 year: 2024 text: 2024-08-30 day: 30 |
PublicationDecade | 2020 |
PublicationYear | 2024 |
Score | 3.8648643 |
SecondaryResourceType | preprint |
SourceID | arxiv |
SourceType | Open Access Repository |
SubjectTerms | Computer Science - Artificial Intelligence; Computer Science - Computer Vision and Pattern Recognition |
Title | Toward a More Complete OMR Solution |
URI | https://arxiv.org/abs/2409.00316 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
linkProvider | Cornell University |
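
The abstract describes notation assembly as predicting pairwise relationships among detected music objects. The sketch below illustrates only that general idea in plain Python and is not the authors' method. The `Detection` class, the `toy_edge_score` heuristic, the `threshold` value, the `allowed` label-pair filter, and the example labels and boxes are all hypothetical, introduced for illustration. A real system would take detections from the object detector and replace the distance heuristic with a trained pairwise model.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Detection:
    """One detected object (hypothetical structure, not from the paper)."""
    label: str
    box: tuple  # (x_min, y_min, x_max, y_max) in pixels


def box_gap(a, b):
    """Smallest axis-aligned gap between two boxes (0.0 if they overlap)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return (dx * dx + dy * dy) ** 0.5


def toy_edge_score(src, dst, max_gap=20.0):
    """Placeholder for a learned pairwise scorer: closer boxes score higher."""
    gap = box_gap(src.box, dst.box)
    return max(0.0, 1.0 - gap / max_gap)


def assemble(detections, score=toy_edge_score, threshold=0.5, allowed=None):
    """Return directed edges (i, j) between detections scoring >= threshold.

    `allowed` optionally restricts edges to given (source label, target label)
    pairs; this filter is an assumption made for the sketch.
    """
    edges = []
    for i, j in permutations(range(len(detections)), 2):
        src, dst = detections[i], detections[j]
        if allowed is not None and (src.label, dst.label) not in allowed:
            continue
        if score(src, dst) >= threshold:
            edges.append((i, j))
    return edges


# Illustrative input: two noteheads and one stem touching the first notehead.
detections = [
    Detection("noteheadFull", (100, 50, 112, 60)),
    Detection("stem", (111, 20, 113, 55)),
    Detection("noteheadFull", (300, 50, 312, 60)),
]
allowed = {("noteheadFull", "stem")}
print(assemble(detections, allowed=allowed))  # → [(0, 1)]
```

Only the first notehead overlaps the stem, so a single edge is produced. Because the edges are built from detector output rather than ground-truth objects, any detection error propagates into the graph, which is the setting the abstract argues should be evaluated.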