Pan-conserved segment tags identify ultra-conserved sequences across assemblies in the human pangenome

Bibliographic Details
Published in Cell reports methods Vol. 3; no. 8; p. 100543
Main Authors Lee, HoJoon, Greer, Stephanie U., Pavlichin, Dmitri S., Zhou, Bo, Urban, Alexander E., Weissman, Tsachy, Ji, Hanlee P., Liao, Wen-Wei, Asri, Mobin, Ebler, Jana, Doerr, Daniel, Haukness, Marina, Hickey, Glenn, Lu, Shuangjia, Lucas, Julian K., Monlong, Jean, Abel, Haley J., Buonaiuto, Silvia, Chang, Xian H., Cheng, Haoyu, Chu, Justin, Colonna, Vincenza, Eizenga, Jordan M., Feng, Xiaowen, Fischer, Christian, Fulton, Robert S., Garg, Shilpa, Groza, Cristian, Guarracino, Andrea, Harvey, William T., Heumos, Simon, Howe, Kerstin, Jain, Miten, Lu, Tsung-Yu, Markello, Charles, Martin, Fergal J., Mitchell, Matthew W., Munson, Katherine M., Mwaniki, Moses Njagi, Novak, Adam M., Olsen, Hugh E., Pesout, Trevor, Porubsky, David, Prins, Pjotr, Sibbesen, Jonas A., Tomlinson, Chad, Villani, Flavia, Vollger, Mitchell R., Antonacci-Fulton, Lucinda L., Baid, Gunjan, Baker, Carl A., Belyaeva, Anastasiya, Billis, Konstantinos, Carroll, Andrew, Chang, Pi-Chuan, Cody, Sarah, Cook, Daniel E., Cornejo, Omar E., Diekhans, Mark, Ebert, Peter, Fairley, Susan, Fedrigo, Olivier, Felsenfeld, Adam L., Formenti, Giulio, Frankish, Adam, Gao, Yan, Giron, Carlos Garcia, Green, Richard E., Haggerty, Leanne, Hoekzema, Kendra, Hourlier, Thibaut, Kolesnikov, Alexey, Korbel, Jan O., Kordosky, Jennifer, Lewis, Alexandra P., Magalhães, Hugo, Marco-Sola, Santiago, Marijon, Pierre, McDaniel, Jennifer, Mountcastle, Jacquelyn, Nattestad, Maria, Olson, Nathan D., Puiu, Daniela, Regier, Allison A., Rhie, Arang, Sacco, Samuel, Sanders, Ashley D., Schneider, Valerie A., Schultz, Baergen I., Shafin, Kishwar, Sirén, Jouni, Smith, Michael W., Sofia, Heidi J., Abou Tayoun, Ahmad N., Thibaud-Nissen, Françoise, Tricomi, Francesca Floriana, Wagner, Justin, Wood, Jonathan M.D., Zimin, Aleksey V.
Format Journal Article
Language English
Published United States Elsevier 28.08.2023
Summary: The human pangenome, a new reference sequence, addresses many limitations of the current GRCh38 reference. The first release is based on 94 high-quality haploid assemblies from individuals with diverse backgrounds. We employed a k-mer indexing strategy for comparative analysis across multiple assemblies, including the pangenome reference, GRCh38, and CHM13, a telomere-to-telomere reference assembly. Our k-mer indexing approach enabled us to identify a valuable collection of universally conserved sequences across all assemblies, referred to as "pan-conserved segment tags" (PSTs). By examining intervals between these segments, we discerned highly conserved genomic segments and those with structurally related polymorphisms. We found 60,764 polymorphic intervals with unique geo-ethnic features in the pangenome reference. In this study, we utilized ultra-conserved sequences (PSTs) to forge a link between human pangenome assemblies and reference genomes. This methodology enables the examination of any sequence of interest within the pangenome, using the reference genome as a comparative framework.
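The k-mer indexing idea in the summary can be illustrated with a toy sketch. This is not the authors' pipeline: the function names, the value of k, the example sequences, and the requirement that a tag occur exactly once per assembly are all assumptions made for illustration. The sketch keeps only k-mers found in every assembly, then locates them in one assembly so the stretches between consecutive tags can be compared.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every k-mer of length k in one sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def conserved_kmers(assemblies, k):
    """Return k-mers that occur exactly once in every assembly.

    The exactly-once filter is an assumption of this sketch; it makes
    each tag an unambiguous anchor within each assembly.
    """
    shared = None
    for seq in assemblies.values():
        unique = {kmer for kmer, n in kmer_counts(seq, k).items() if n == 1}
        shared = unique if shared is None else shared & unique
    return shared or set()


def tag_positions(seq, tags, k):
    """Start positions (0-based) of conserved tags within one assembly."""
    return [i for i in range(len(seq) - k + 1) if seq[i:i + k] in tags]


# Hypothetical toy "assemblies" that differ only at positions 4 and 5.
toy = {"asm_a": "ACGTTTGACCA", "asm_b": "ACGTAGGACCA"}
tags = conserved_kmers(toy, k=4)
positions = tag_positions(toy["asm_a"], tags, k=4)
```

In this toy case the tags flank the two differing bases, so the interval between the first and second tag is where the assemblies diverge. A genome-scale analysis would need a memory-efficient k-mer index rather than in-memory Python sets.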
ISSN: 2667-2375
DOI: 10.1016/j.crmeth.2023.100543