Learning a Joint Embedding Space of Monophonic and Mixed Music Signals for Singing Voice
Published in | arXiv.org |
---|---|
Main Authors | Lee, Kyungyun; Nam, Juhan |
Format | Paper |
Language | English |
Published | Ithaca: Cornell University Library, arXiv.org, 26.06.2019 |
Abstract | Previous approaches to singer identification have used either monophonic vocal tracks or mixed tracks containing multiple instruments, leaving a semantic gap between these two domains of audio. In this paper, we present a system that learns a joint embedding space of monophonic and mixed tracks for singing voice. We use a metric learning method, which ensures that tracks of the same singer, from either domain, are mapped closer to each other than tracks of different singers. We train the system on a large synthetic dataset generated by music mashup to reflect real-world music recordings. Our approach opens up new possibilities for cross-domain tasks, e.g., given a monophonic track of a singer as a query, retrieving mixed tracks sung by the same singer from the database. It also requires no additional vocal enhancement steps such as source separation. We show the effectiveness of our system for singer identification and query-by-singer in both same-domain and cross-domain tasks. |
---|---|
Author | Lee, Kyungyun; Nam, Juhan |
ContentType | Paper |
Copyright | 2019. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
Discipline | Music Physics |
EISSN | 2331-8422 |
Genre | Working Paper/Pre-Print |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
OpenAccessLink | https://www.proquest.com/docview/2248019371?pq-origsite=%requestingapplication% |
PublicationDate | 2019-06-26 |
PublicationPlace | Ithaca |
PublicationTitle | arXiv.org |
PublicationYear | 2019 |
Publisher | Cornell University Library, arXiv.org |
SecondaryResourceType | preprint |
SourceID | proquest |
SourceType | Aggregation Database |
SubjectTerms | Domains; Embedded systems; Embedding; Learning; Music; Musical performances; Musicians & conductors; Singers; Singing |
Title | Learning a Joint Embedding Space of Monophonic and Mixed Music Signals for Singing Voice |
URI | https://www.proquest.com/docview/2248019371 |
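
The metric learning idea in the abstract (tracks of the same singer, from either domain, end up closer than tracks of different singers) and the cross-domain query-by-singer task can be illustrated with a small sketch. This record does not state the loss, the similarity measure, or the network, so everything below is an assumption for illustration: a triplet margin loss over cosine similarity, hand-written 2-D vectors standing in for learned embeddings, and the hypothetical helper names `cosine`, `triplet_loss`, and `query_by_singer`.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def triplet_loss(anchor, positive, negative, margin=0.1):
    """Assumed triplet margin loss: zero once the same-singer pair is more
    similar than the different-singer pair by at least `margin`."""
    return max(0.0, margin - cosine(anchor, positive) + cosine(anchor, negative))


def query_by_singer(query, database, top_k=3):
    """Rank database embeddings by similarity to the query; return their indices."""
    ranked = sorted(range(len(database)),
                    key=lambda i: cosine(query, database[i]),
                    reverse=True)
    return ranked[:top_k]


# Toy embeddings (made up, not from the paper): one monophonic query and
# three mixed tracks, where track 0 is imagined to be by the same singer.
mono_query = [1.0, 0.0]
mixed_db = [[0.9, 0.1], [0.0, 1.0], [0.7, 0.7]]

ranking = query_by_singer(mono_query, mixed_db)
well_separated = triplet_loss(mono_query, mixed_db[0], mixed_db[1])
violated = triplet_loss(mono_query, mixed_db[1], mixed_db[0])
```

In this toy setup the monophonic query ranks mixed track 0 first, and the loss is zero only when the same-singer mixed track is already the closer one. In a trained system the vectors would come from an embedding network applied to audio, which this sketch leaves out.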