Deep DNA Storage: Scalable and Robust DNA Storage via Coding Theory and Deep Learning
Published in | arXiv.org |
---|---|
Main Authors | Bar-Lev, Daniella; Orr, Itai; Sabary, Omer; Etzion, Tuvi; Yaakobi, Eitan |
Format | Paper |
Language | English |
Published | Ithaca: Cornell University Library, arXiv.org, 11.03.2024 |
Abstract | DNA-based storage is an emerging technology that enables digital information to be archived in DNA molecules. It offers major advantages over magnetic and optical storage, including exceptional information density, enhanced data durability, and negligible power consumption for maintaining data integrity. Accessing the data requires an information retrieval process whose main bottlenecks are scalability and accuracy, two goals that naturally trade off against each other. Here we show a modular and holistic approach that combines Deep Neural Networks (DNN) trained on simulated data, Tensor-Product (TP) based Error-Correcting Codes (ECC), and a safety margin mechanism into a single coherent pipeline. We demonstrate our solution on 3.1 MB of information using two different sequencing technologies. Our work improves upon the current leading solutions with up to a 3200x increase in speed and a 40% improvement in accuracy, and offers a code rate of 1.6 bits per base in a high-noise regime. In a broader sense, our work shows a viable path to commercial DNA storage solutions, which are hindered by current information retrieval processes. |
---|---|
Author | Etzion, Tuvi; Sabary, Omer; Yaakobi, Eitan; Orr, Itai; Bar-Lev, Daniella |
Author_xml | – sequence: 1 givenname: Daniella surname: Bar-Lev fullname: Bar-Lev, Daniella – sequence: 2 givenname: Itai surname: Orr fullname: Orr, Itai – sequence: 3 givenname: Omer surname: Sabary fullname: Sabary, Omer – sequence: 4 givenname: Tuvi surname: Etzion fullname: Etzion, Tuvi – sequence: 5 givenname: Eitan surname: Yaakobi fullname: Yaakobi, Eitan |
ContentType | Paper |
Copyright | 2024. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
DatabaseName | ProQuest SciTech Collection ProQuest Technology Collection Materials Science & Engineering Collection ProQuest Central (Alumni) ProQuest Central UK/Ireland ProQuest Central Essentials ProQuest Central Technology Collection ProQuest One Community College ProQuest Central SciTech Premium Collection (Proquest) (PQ_SDU_P3) ProQuest Engineering Collection Engineering Database ProQuest Publicly Available Content database ProQuest One Academic Eastern Edition (DO NOT USE) ProQuest One Academic ProQuest One Academic UKI Edition ProQuest Central China Engineering Collection |
DatabaseTitle | Publicly Available Content Database Engineering Database Technology Collection ProQuest Central Essentials ProQuest One Academic Eastern Edition ProQuest Central (Alumni Edition) SciTech Premium Collection ProQuest One Community College ProQuest Technology Collection ProQuest SciTech Collection ProQuest Central China ProQuest Central ProQuest Engineering Collection ProQuest One Academic UKI Edition ProQuest Central Korea Materials Science & Engineering Collection ProQuest One Academic Engineering Collection |
DatabaseTitleList | Publicly Available Content Database |
Database_xml | – sequence: 1 dbid: 8FG name: ProQuest Technology Collection url: https://search.proquest.com/technologycollection1 sourceTypes: Aggregation Database |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Physics |
EISSN | 2331-8422 |
Genre | Working Paper/Pre-Print |
ID | FETCH-proquest_journals_25684000093 |
IEDL.DBID | 8FG |
IngestDate | Thu Oct 10 19:11:04 EDT 2024 |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-proquest_journals_25684000093 |
OpenAccessLink | https://www.proquest.com/docview/2568400009?pq-origsite=%requestingapplication% |
PQID | 2568400009 |
PQPubID | 2050157 |
ParticipantIDs | proquest_journals_2568400009 |
PublicationCentury | 2000 |
PublicationDate | 20240311 |
PublicationDateYYYYMMDD | 2024-03-11 |
PublicationDate_xml | – month: 03 year: 2024 text: 20240311 day: 11 |
PublicationDecade | 2020 |
PublicationPlace | Ithaca |
PublicationPlace_xml | – name: Ithaca |
PublicationTitle | arXiv.org |
PublicationYear | 2024 |
Publisher | Cornell University Library, arXiv.org |
Publisher_xml | – name: Cornell University Library, arXiv.org |
SSID | ssj0002672553 |
Score | 3.529157 |
SecondaryResourceType | preprint |
SourceID | proquest |
SourceType | Aggregation Database |
SubjectTerms | Artificial neural networks; Clustering; Codes; Deoxyribonucleic acid; DNA; Error correcting codes; Error correction; Gene sequencing; Information retrieval; Machine learning; Nanotechnology; Robustness (mathematics); Storage systems; Synthesis |
Title | Deep DNA Storage: Scalable and Robust DNA Storage via Coding Theory and Deep Learning |
URI | https://www.proquest.com/docview/2568400009 |
hasFullText | 1 |
inHoldings | 1 |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Deep+DNA+Storage%3A+Scalable+and+Robust+DNA+Storage+via+Coding+Theory+and+Deep+Learning&rft.jtitle=arXiv.org&rft.au=Bar-Lev%2C+Daniella&rft.au=Orr%2C+Itai&rft.au=Sabary%2C+Omer&rft.au=Etzion%2C+Tuvi&rft.date=2024-03-11&rft.pub=Cornell+University+Library%2C+arXiv.org&rft.eissn=2331-8422 |
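
Note on the figures in the abstract: the reported code rate of 1.6 bits per base and the 3.1 MB payload imply a rough nucleotide budget. The sketch below is a back-of-the-envelope calculation only, not the authors' method. It assumes 1 MB = 10^6 bytes, compares against the 2 bits per base ceiling of a four-letter alphabet, and ignores anything the record does not specify (primers, indices, strand length, number of physical copies). The function name `bases_needed` is hypothetical.

```python
import math


def bases_needed(payload_bytes: int, bits_per_base: float) -> int:
    """Smallest whole number of nucleotides that can hold the payload at the given rate."""
    return math.ceil(payload_bytes * 8 / bits_per_base)


PAYLOAD_BYTES = 3_100_000   # assumption: 3.1 MB read as 3.1 * 10^6 bytes
CODE_RATE = 1.6             # bits per base, as stated in the abstract
UNCODED_RATE = 2.0          # log2(4): upper bound for the alphabet {A, C, G, T}

total = bases_needed(PAYLOAD_BYTES, CODE_RATE)       # bases at the reported rate
minimum = bases_needed(PAYLOAD_BYTES, UNCODED_RATE)  # bases with no redundancy at all
overhead = total / minimum - 1                       # extra bases relative to the uncoded bound

print(f"bases at 1.6 bits/base: {total:,}")
print(f"bases at 2.0 bits/base: {minimum:,}")
print(f"overhead vs. uncoded:   {overhead:.0%}")
```

Under these assumptions, a rate of 1.6 bits per base means about one fifth of the synthesized bases carry redundancy rather than payload, which is the budget available to the error-correcting code in the high-noise regime.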