A robust and efficient DNA storage architecture based on modulation encoding and decoding


Bibliographic Details
Published in bioRxiv
Main Authors Zan, Xiangzhen; Xie, Ranze; Yao, Xiangyu; Xu, Peng; Liu, Wenbin
Format Paper
Language English
Published Cold Spring Harbor: Cold Spring Harbor Laboratory Press, 29.07.2022

Summary: Thanks to its high density and long durability, synthetic DNA has been widely considered a promising solution to the data explosion problem. However, due to the large number of random base insertion-deletion-substitution (IDS) errors from sequencing, reliable data recovery remains a critical challenge, which hinders its large-scale application. Here, we propose a modulation-based DNA storage architecture. Experiments on simulated and real datasets demonstrate that it has two distinct advantages. First, modulation encoding provides a simple way to ensure that the encoded DNA sequences comply with biological sequence constraints (i.e., GC-balanced and free of homopolymers). Second, modulation decoding is highly efficient and extremely robust in detecting insertions and deletions, and can correct up to ~40% errors. These two advantages pave the way for future high-throughput and low-cost techniques, and will kickstart the actualization of a viable, large-scale system for DNA data storage. Competing Interest Statement: The authors have declared no competing interest. Footnotes: * Updated with more comparison results.
DOI:10.1101/2022.05.25.490755
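
The two sequence constraints named in the summary (GC balance and absence of homopolymers) can be illustrated with a short Python sketch. The checks `gc_fraction` and `longest_homopolymer` follow directly from the definitions of those constraints. The encoder `modulate` and its `BASE_FOR` table are hypothetical: they assume an alternating 0/1 carrier paired with each data bit, which is one plausible way a modulation scheme could satisfy both constraints. This record does not describe the paper's actual mapping, which may differ.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C (0.5 means GC-balanced)."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of one repeated base (1 means no homopolymers)."""
    if not seq:
        return 0
    longest = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest


# Hypothetical mapping (an assumption, not taken from the record):
# (carrier bit, data bit) -> base. Carrier 0 selects A/T, carrier 1 selects C/G.
BASE_FOR = {(0, 0): "A", (0, 1): "T", (1, 0): "C", (1, 1): "G"}


def modulate(data_bits: str) -> str:
    """Pair each data bit with an alternating 0,1,0,1,... carrier bit."""
    return "".join(
        BASE_FOR[(i % 2, int(bit))] for i, bit in enumerate(data_bits)
    )


strand = modulate("00110101")
print(strand, gc_fraction(strand), longest_homopolymer(strand))
```

Because the assumed carrier alternates, neighbouring bases always come from different pairs (A/T versus C/G), so no base can repeat and exactly half of the positions in an even-length strand hold G or C, whatever the data bits are.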