A complete reference genome improves analysis of human genetic variation

Bibliographic Details
Published in Science (American Association for the Advancement of Science) Vol. 376; no. 6588; p. eabl3533
Main Authors Aganezov, Sergey; Yan, Stephanie M.; Soto, Daniela C.; Kirsche, Melanie; Zarate, Samantha; Avdeyev, Pavel; Taylor, Dylan J.; Shafin, Kishwar; Shumate, Alaina; Xiao, Chunlin; Wagner, Justin; McDaniel, Jennifer; Olson, Nathan D.; Sauria, Michael E. G.; Vollger, Mitchell R.; Rhie, Arang; Meredith, Melissa; Martin, Skylar; Lee, Joyce; Koren, Sergey; Rosenfeld, Jeffrey A.; Paten, Benedict; Layer, Ryan; Chin, Chen-Shan; Sedlazeck, Fritz J.; Hansen, Nancy F.; Miller, Danny E.; Phillippy, Adam M.; Miga, Karen H.; McCoy, Rajiv C.; Dennis, Megan Y.; Zook, Justin M.; Schatz, Michael C.
Format Journal Article
Language English
Published United States The American Association for the Advancement of Science 01.04.2022
Summary: Compared to its predecessors, the Telomere-to-Telomere CHM13 genome adds nearly 200 million base pairs of sequence, corrects thousands of structural errors, and unlocks the most complex regions of the human genome for clinical and functional study. We show how this reference universally improves read mapping and variant calling for 3202 and 17 globally diverse samples sequenced with short and long reads, respectively. We identify hundreds of thousands of variants per sample in previously unresolved regions, showcasing the promise of the T2T-CHM13 reference for evolutionary and biomedical discovery. Simultaneously, this reference eliminates tens of thousands of spurious variants per sample, including reduction of false positives in 269 medically relevant genes by up to a factor of 12. Because of these improvements in variant discovery coupled with population and functional genomic resources, T2T-CHM13 is positioned to replace GRCh38 as the prevailing reference for human genetics.
Author contributions: Annotation and comparison of GRCh38 and T2T-CHM13, A.S., A.R., D.C.S., D.E.M., D.J.T., S.M.Y., M.E.G.S., M.R.V., N.F.H., M.Y.D., J.M.Z., R.C.M.; short-read analysis, S.Z., S.A., S.M.Y., R.C.M., M.C.S.; long-read analysis, K.S., M.K., S.A., J.Z., M.C.S.; population genetics, S.A., D.J.T., S.M.Y., M.C.S., R.C.M.; clinical variant analysis, D.C.S., D.E.M., M.Y.D., J.M.Z.; variant analysis, A.M.P., B.P., C.S.C., C.X., D.C.S., F.J.S., J.L., J.M., J.M.Z., J.W., M.M., M.C.S., M.Y.D., N.D.O., P.A., R.C.M., R.L., S.A., S.K., S.M.; project design, A.M.P., B.P., F.J.S., J.A.R., J.M.Z., K.H.M., M.Y.D., R.C.M., M.C.S., R.L.; writing, R.C.M., M.Y.D., J.M.Z., and M.C.S. with input from all of the authors.
These authors contributed equally to this work.
ISSN: 0036-8075
1095-9203
DOI: 10.1126/science.abl3533