Deep distributed computing to reconstruct extremely large lineage trees
Published in | Nature biotechnology Vol. 40; no. 4; pp. 566 - 575 |
---|---|
Main Authors | , , , , , , , , , , , |
Format | Journal Article |
Language | English |
Published | New York; Nature Publishing Group US; 01.04.2022; Nature Publishing Group |
Summary: | Phylogeny estimation (the reconstruction of evolutionary trees) has recently been applied to CRISPR-based cell lineage tracing, allowing the developmental history of an individual tissue or organism to be inferred from a large number of mutated sequences in somatic cells. However, current computational methods are not able to construct phylogenetic trees from extremely large numbers of input sequences. Here, we present a deep distributed computing framework to comprehensively trace accurate large lineages (FRACTAL) that substantially enhances the scalability of current lineage estimation software tools. FRACTAL first reconstructs only an upstream lineage of the input sequences and recursively iterates the same procedure for its downstream lineages using independent computing nodes. We demonstrate the utility of FRACTAL by reconstructing lineages from >235 million simulated sequences and from >16 million cells from a simulated experiment with a CRISPR system that accumulates mutations during cell proliferation. We also successfully applied FRACTAL to evolutionary tree reconstructions and to an experiment using error-prone PCR (EP-PCR) for large-scale sequence diversification.
Cell lineage tracing is scaled up to hundreds of millions of simulated sequences with distributed computing. |
---|---|
Bibliography: | Author contributions: N.K., H.M. and N.Y. conceived the high-level concept of FRACTAL. N.K., W.I. and N.Y. designed the study. N.K. implemented FRACTAL. K.W. and N.K. implemented PRESUME. N.K. led the analyses. N.M. and N.Y. designed the high-content cell lineage recording model. Y.K. and K.W. supported the analyses of the simulated cell lineages. S.I. and M.T. performed the EP-PCR experiments. Y.K. supported the analysis of the EP-PCR experiments. N.K., K.O., D.P. and T.I. performed data visualization using HiView. N.K., Y.K., S.I. and N.Y. wrote the manuscript. |
ISSN: | 1087-0156 1546-1696 |
DOI: | 10.1038/s41587-021-01111-2 |
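
The summary describes FRACTAL as reconstructing only an upstream lineage first and then repeating the same procedure on each downstream lineage. The Python sketch below illustrates that divide-and-conquer idea on toy data. It is not the authors' implementation: the random sampling of anchor sequences, the Hamming distance, the single-linkage clustering used as a stand-in tree builder, and all function names (`hamming`, `leaves`, `small_tree`, `fractal_sketch`) are assumptions made for illustration. Per the summary, FRACTAL runs the downstream subproblems on independent computing nodes; here they are plain recursive calls.

```python
import random

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def leaves(tree):
    """Leaf names of a tree encoded as nested 2-tuples of strings."""
    if isinstance(tree, str):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])

def small_tree(seqs):
    """Toy stand-in for a real tree builder: single-linkage clustering."""
    clusters = sorted(seqs)  # every name starts as its own one-leaf tree
    while len(clusters) > 1:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        # Merge the two clusters whose closest members are most similar.
        i, j = min(pairs, key=lambda p: min(
            hamming(seqs[a], seqs[b])
            for a in leaves(clusters[p[0]])
            for b in leaves(clusters[p[1]])))
        merged = (clusters[i], clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0]

def fractal_sketch(seqs, threshold=3, sample_size=4, rng=None):
    """Build the upstream split from a sample, then recurse downstream.

    Assumes sample_size >= 2 so the upstream tree always has two clades.
    """
    rng = rng or random.Random(0)
    if len(seqs) <= threshold:
        return small_tree(seqs)  # small enough to solve directly
    # Hypothetical sampling step: pick anchors and build the upstream tree.
    anchors = rng.sample(sorted(seqs), min(sample_size, len(seqs)))
    upstream = small_tree({n: seqs[n] for n in anchors})
    clades = [leaves(upstream[0]), leaves(upstream[1])]
    groups = [{n: seqs[n] for n in clade} for clade in clades]
    # Assign every non-anchor sequence to the clade of its nearest anchor.
    for name, seq in seqs.items():
        if name in anchors:
            continue
        nearest = min(anchors, key=lambda a: hamming(seq, seqs[a]))
        groups[0 if nearest in clades[0] else 1][name] = seq
    # Each group is an independent subproblem; a distributed version
    # would hand these to separate workers instead of recursing in-process.
    return tuple(fractal_sketch(g, threshold, sample_size, rng) for g in groups)

demo = {
    "a1": "AAAAAAAA", "a2": "AAAAAAAC", "a3": "AAAAAACC",
    "t1": "TTTTTTTT", "t2": "TTTTTTTG", "t3": "TTTTTTGG",
}
tree = fractal_sketch(demo)
```

Because both clades always keep their own anchors, each recursive call receives strictly fewer sequences than its parent, so the recursion terminates.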