Gaps and complex structurally variant loci in phased genome assemblies
There has been tremendous progress in phased genome assembly production by combining long-read data with parental information or linked-read data. Nevertheless, a typical phased genome assembly generated by trio-hifiasm still contains more than 140 gaps. We perform a detailed analysis of gaps, asse...
Published in | Genome research Vol. 33; no. 4; pp. 496 - 510 |
---|---|
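Main Authors | Porubsky, David; Vollger, Mitchell R.; Harvey, William T.; Rozanski, Allison N.; Ebert, Peter; Hickey, Glenn; Hasenfeld, Patrick; Sanders, Ashley D.; Stober, Catherine; Korbel, Jan O.; Paten, Benedict; Marschall, Tobias; Eichler, Evan E. |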
Format | Journal Article |
Language | English |
Published | United States: Cold Spring Harbor Laboratory Press, 01.04.2023 |
Subjects | |
Abstract | There has been tremendous progress in phased genome assembly production by combining long-read data with parental information or linked-read data. Nevertheless, a typical phased genome assembly generated by trio-hifiasm still contains more than 140 gaps. We perform a detailed analysis of gaps, assembly breaks, and misorientations from 182 haploid assemblies obtained from a diversity panel of 77 unique human samples. Although trio-based approaches using HiFi are the current gold standard, chromosome-wide phasing accuracy is comparable when using Strand-seq instead of parental data. Importantly, the majority of assembly gaps cluster near the largest and most identical repeats (including segmental duplications [35.4%], satellite DNA [22.3%], or regions enriched in GA/AT-rich DNA [27.4%]). Consequently, 1513 protein-coding genes overlap assembly gaps in at least one haplotype, and 231 are recurrently disrupted or missing from five or more haplotypes. Furthermore, we estimate that 6–7 Mbp of DNA are misorientated per haplotype irrespective of whether trio-free or trio-based approaches are used. Of these misorientations, 81% correspond to bona fide large inversion polymorphisms in the human species, most of which are flanked by large segmental duplications. We also identify large-scale alignment discontinuities consistent with 11.9 Mbp of deletions and 161.4 Mbp of insertions per haploid genome. Although 99% of this variation corresponds to satellite DNA, we identify 230 regions of euchromatic DNA with frequent expansions and contractions, nearly half of which overlap with 197 protein-coding genes. Such variable and incompletely assembled regions are important targets for future algorithmic development and pangenome representation. |
---|---|
Author | Korbel, Jan O.; Ebert, Peter; Stober, Catherine; Marschall, Tobias; Paten, Benedict; Hasenfeld, Patrick; Eichler, Evan E.; Porubsky, David; Vollger, Mitchell R.; Rozanski, Allison N.; Harvey, William T.; Sanders, Ashley D.; Hickey, Glenn |
Author_xml | – sequence: 1 givenname: David orcidid: 0000-0001-8414-8966 surname: Porubsky fullname: Porubsky, David – sequence: 2 givenname: Mitchell R. orcidid: 0000-0002-8651-1615 surname: Vollger fullname: Vollger, Mitchell R. – sequence: 3 givenname: William T. orcidid: 0000-0003-0646-7528 surname: Harvey fullname: Harvey, William T. – sequence: 4 givenname: Allison N. orcidid: 0000-0002-5034-1773 surname: Rozanski fullname: Rozanski, Allison N. – sequence: 5 givenname: Peter orcidid: 0000-0001-7441-532X surname: Ebert fullname: Ebert, Peter – sequence: 6 givenname: Glenn orcidid: 0000-0002-2280-9404 surname: Hickey fullname: Hickey, Glenn – sequence: 7 givenname: Patrick orcidid: 0000-0003-2319-2482 surname: Hasenfeld fullname: Hasenfeld, Patrick – sequence: 8 givenname: Ashley D. orcidid: 0000-0003-3945-0677 surname: Sanders fullname: Sanders, Ashley D. – sequence: 9 givenname: Catherine orcidid: 0000-0002-9481-013X surname: Stober fullname: Stober, Catherine – sequence: 10 givenname: Jan O. orcidid: 0000-0002-2798-3794 surname: Korbel fullname: Korbel, Jan O. – sequence: 11 givenname: Benedict orcidid: 0000-0001-8863-3539 surname: Paten fullname: Paten, Benedict – sequence: 12 givenname: Tobias orcidid: 0000-0002-9376-1030 surname: Marschall fullname: Marschall, Tobias – sequence: 13 givenname: Evan E. orcidid: 0000-0002-8246-4014 surname: Eichler fullname: Eichler, Evan E. |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/37164484$$D View this record in MEDLINE/PubMed |
BookMark | eNp1kc9LHTEQx0NRqr722GsJeOlln_m1SfYkItUWBC_2HLLZec9INlmTXdH_vnk8K63Q0wzMZ758Z74n6CCmCAh9oWRNKaFn27xmSnEu1pSxD-iYtqJrWiG7g9oTrZuOtPQInZTyQAjhQuuP6IgrKoXQ4hhdXdupYBsH7NI4BXjGZc6Lm5dsQ3jBTzZ7G2cckvPYRzzd2wID3kJMI2BbCox98FA-ocONDQU-v9YV-nX1_e7yR3Nze_3z8uKmcYLKuVFOWkZlr5hTgksHnAy2Vc52Q7_Rmg8MaE_BSik2RDDZOzEIRYZOgHU7fIXO97rT0o8wOIhzNWqm7EebX0yy3vw7if7ebNOToYRxwbquKnx7VcjpcYEym9EXByHYCGkphmnKWqJV_dUKnb5DH9KSY71vRwnSEtW1lfr6t6U3L39-XIFmD7icSsmweUMoMbsMzTabfYamZlh5_o53frazT7uLfPjP1m8UW5__ |
CitedBy_id | crossref_primary_10_1101_gr_277175_122 crossref_primary_10_1038_s41588_024_02051_8 crossref_primary_10_1186_s13059_023_02995_w crossref_primary_10_1038_s41576_024_00718_w crossref_primary_10_1042_ETLS20230074 crossref_primary_10_1038_s41592_024_02269_8 crossref_primary_10_1038_s41586_023_05895_y crossref_primary_10_1038_s41586_023_05896_x crossref_primary_10_17816_fm16167 crossref_primary_10_1038_s41467_025_57505_2 crossref_primary_10_1016_j_scib_2023_06_014 crossref_primary_10_1186_s13023_024_03307_6 crossref_primary_10_1016_j_gde_2024_102233 crossref_primary_10_1186_s13059_023_02919_8 crossref_primary_10_1016_j_cell_2024_01_002 crossref_primary_10_1038_s41435_024_00279_2 crossref_primary_10_1101_gr_279346_124 crossref_primary_10_1016_j_cell_2024_01_052 |
Cites_doi | 10.1126/science.abf7117 10.1038/s41588-022-01043-w 10.1093/bioinformatics/bty191 10.1038/s41586-023-05896-x 10.1101/gr.263566.120 10.1038/nature15393 10.1038/s41587-023-01662-6 10.1111/ahg.12364 10.1038/ng.3092 10.1093/bioinformatics/btp352 10.1038/nmeth0810-576 10.1038/s41592-020-01056-5 10.1126/science.abl4178 10.1093/bioinformatics/btp698 10.1038/nbt.4235 10.1038/nprot.2017.029 10.1126/science.abj6987 10.1089/cmb.2014.0157 10.1038/s41586-022-04601-8 10.1101/gr.209841.116 10.1038/s41587-019-0217-9 10.1093/nar/30.11.2478 10.1038/s41467-018-08148-z 10.1016/j.cell.2022.08.004 10.1016/j.gpb.2016.05.004 10.1038/s41587-019-0366-x 10.1016/j.ajhg.2022.02.014 10.1038/s41587-019-0072-8 10.1016/j.cell.2022.04.017 10.1126/science.1197005 10.1101/2023.04.05.535718 10.1038/s41587-020-0503-6 10.1038/ng.909 10.1038/s41586-023-05976-y 10.1126/science.abj6965 10.1038/s41586-023-05895-y 10.1038/s41586-022-05325-5 10.1038/s41587-020-0711-0 10.1038/ng1862 10.1038/nature15394 10.1101/705616 10.1038/s41587-020-0719-5 10.1038/nmeth.2206 10.1038/s41586-021-03420-7 10.1186/s12915-018-0535-2 10.1038/s41587-022-01261-x 10.1093/bioinformatics/btv098 |
ContentType | Journal Article |
Contributor | Ebler, Jana Prins, Pjotr Green, Richard E Martin, Fergal J Billis, Konstantinos Mountcastle, Jacquelyn Fairley, Susan Frankish, Adam Lu, Tsung-Yu Markello, Charles Mwaniki, Moses Njagi Guarracino, Andrea Baker, Carl A Jarvis, Erich D Monlong, Jean Giron, Carlos Garcia Pesout, Trevor Cornejo, Omar E Gao, Yan Paten, Benedict Colonna, Vincenza Rautiainen, Mikko Flicek, Paul Rhie, Arang Nurk, Sergey Chaisson, Mark J P Ji, Hanlee P Doerr, Daniel Kolesnikov, Alexey Olsen, Hugh E Harvey, William T Chang, Pi-Chuan Belyaeva, Anastasiya Garg, Shilpa Magalhães, Hugo Cook, Daniel E Groza, Cristian Hoekzema, Kendra Marco-Sola, Santiago Asri, Mobin Chu, Justin Lu, Shuangjia Munson, Katherine M Cheng, Haoyu Korbel, Jan O Lee, HoJoon Cody, Sarah Chang, Xian H Ebert, Peter Haussler, David Olson, Nathan D Marijon, Pierre Garrison, Nanibaa' A McDaniel, Jennifer Fedrigo, Olivier Hall, Ira M Fischer, Christian Fulton, Robert S Haukness, Marina Kordosky, Jennifer Bourque, Guillaume Carroll, Andrew Regier, Allison A Koren, Sergey Garrison, Erik Mitchell, Matthew W Nattesta |
Contributor_xml | – sequence: 1 givenname: Haley J surname: Abel fullname: Abel, Haley J – sequence: 2 givenname: Lucinda L surname: Antonacci-Fulton fullname: Antonacci-Fulton, Lucinda L – sequence: 3 givenname: Mobin surname: Asri fullname: Asri, Mobin – sequence: 4 givenname: Gunjan surname: Baid fullname: Baid, Gunjan – sequence: 5 givenname: Carl A surname: Baker fullname: Baker, Carl A – sequence: 6 givenname: Anastasiya surname: Belyaeva fullname: Belyaeva, Anastasiya – sequence: 7 givenname: Konstantinos surname: Billis fullname: Billis, Konstantinos – sequence: 8 givenname: Guillaume surname: Bourque fullname: Bourque, Guillaume – sequence: 9 givenname: Silvia surname: Buonaiuto fullname: Buonaiuto, Silvia – sequence: 10 givenname: Andrew surname: Carroll fullname: Carroll, Andrew – sequence: 11 givenname: Mark J P surname: Chaisson fullname: Chaisson, Mark J P – sequence: 12 givenname: Pi-Chuan surname: Chang fullname: Chang, Pi-Chuan – sequence: 13 givenname: Xian H surname: Chang fullname: Chang, Xian H – sequence: 14 givenname: Haoyu surname: Cheng fullname: Cheng, Haoyu – sequence: 15 givenname: Justin surname: Chu fullname: Chu, Justin – sequence: 16 givenname: Sarah surname: Cody fullname: Cody, Sarah – sequence: 17 givenname: Vincenza surname: Colonna fullname: Colonna, Vincenza – sequence: 18 givenname: Daniel E surname: Cook fullname: Cook, Daniel E – sequence: 19 givenname: Robert M surname: Cook-Deegan fullname: Cook-Deegan, Robert M – sequence: 20 givenname: Omar E surname: Cornejo fullname: Cornejo, Omar E – sequence: 21 givenname: Mark surname: Diekhans fullname: Diekhans, Mark – sequence: 22 givenname: Daniel surname: Doerr fullname: Doerr, Daniel – sequence: 23 givenname: Peter surname: Ebert fullname: Ebert, Peter – sequence: 24 givenname: Jana surname: Ebler fullname: Ebler, Jana – sequence: 25 givenname: Evan E surname: Eichler fullname: Eichler, Evan E – sequence: 26 givenname: Jordan M surname: Eizenga fullname: Eizenga, Jordan M – 
sequence: 27 givenname: Susan surname: Fairley fullname: Fairley, Susan – sequence: 28 givenname: Olivier surname: Fedrigo fullname: Fedrigo, Olivier – sequence: 29 givenname: Adam L surname: Felsenfeld fullname: Felsenfeld, Adam L – sequence: 30 givenname: Xiaowen surname: Feng fullname: Feng, Xiaowen – sequence: 31 givenname: Christian surname: Fischer fullname: Fischer, Christian – sequence: 32 givenname: Paul surname: Flicek fullname: Flicek, Paul – sequence: 33 givenname: Giulio surname: Formenti fullname: Formenti, Giulio – sequence: 34 givenname: Adam surname: Frankish fullname: Frankish, Adam – sequence: 35 givenname: Robert S surname: Fulton fullname: Fulton, Robert S – sequence: 36 givenname: Yan surname: Gao fullname: Gao, Yan – sequence: 37 givenname: Shilpa surname: Garg fullname: Garg, Shilpa – sequence: 38 givenname: Erik surname: Garrison fullname: Garrison, Erik – sequence: 39 givenname: Nanibaa' A surname: Garrison fullname: Garrison, Nanibaa' A – sequence: 40 givenname: Carlos Garcia surname: Giron fullname: Giron, Carlos Garcia – sequence: 41 givenname: Richard E surname: Green fullname: Green, Richard E – sequence: 42 givenname: Cristian surname: Groza fullname: Groza, Cristian – sequence: 43 givenname: Andrea surname: Guarracino fullname: Guarracino, Andrea – sequence: 44 givenname: Leanne surname: Haggerty fullname: Haggerty, Leanne – sequence: 45 givenname: Ira M surname: Hall fullname: Hall, Ira M – sequence: 46 givenname: William T surname: Harvey fullname: Harvey, William T – sequence: 47 givenname: Marina surname: Haukness fullname: Haukness, Marina – sequence: 48 givenname: David surname: Haussler fullname: Haussler, David – sequence: 49 givenname: Simon surname: Heumos fullname: Heumos, Simon – sequence: 50 givenname: Glenn surname: Hickey fullname: Hickey, Glenn – sequence: 51 givenname: Kendra surname: Hoekzema fullname: Hoekzema, Kendra – sequence: 52 givenname: Thibaut surname: Hourlier fullname: Hourlier, Thibaut – sequence: 53 
givenname: Kerstin surname: Howe fullname: Howe, Kerstin – sequence: 54 givenname: Miten surname: Jain fullname: Jain, Miten – sequence: 55 givenname: Erich D surname: Jarvis fullname: Jarvis, Erich D – sequence: 56 givenname: Hanlee P surname: Ji fullname: Ji, Hanlee P – sequence: 57 givenname: Eimear E surname: Kenny fullname: Kenny, Eimear E – sequence: 58 givenname: Barbara A surname: Koenig fullname: Koenig, Barbara A – sequence: 59 givenname: Alexey surname: Kolesnikov fullname: Kolesnikov, Alexey – sequence: 60 givenname: Jan O surname: Korbel fullname: Korbel, Jan O – sequence: 61 givenname: Jennifer surname: Kordosky fullname: Kordosky, Jennifer – sequence: 62 givenname: Sergey surname: Koren fullname: Koren, Sergey – sequence: 63 givenname: HoJoon surname: Lee fullname: Lee, HoJoon – sequence: 64 givenname: Alexandra P surname: Lewis fullname: Lewis, Alexandra P – sequence: 65 givenname: Heng surname: Li fullname: Li, Heng – sequence: 66 givenname: Wen-Wei surname: Liao fullname: Liao, Wen-Wei – sequence: 67 givenname: Shuangjia surname: Lu fullname: Lu, Shuangjia – sequence: 68 givenname: Tsung-Yu surname: Lu fullname: Lu, Tsung-Yu – sequence: 69 givenname: Julian K surname: Lucas fullname: Lucas, Julian K – sequence: 70 givenname: Hugo surname: Magalhães fullname: Magalhães, Hugo – sequence: 71 givenname: Santiago surname: Marco-Sola fullname: Marco-Sola, Santiago – sequence: 72 givenname: Pierre surname: Marijon fullname: Marijon, Pierre – sequence: 73 givenname: Charles surname: Markello fullname: Markello, Charles – sequence: 74 givenname: Tobias surname: Marschall fullname: Marschall, Tobias – sequence: 75 givenname: Fergal J surname: Martin fullname: Martin, Fergal J – sequence: 76 givenname: Ann surname: McCartney fullname: McCartney, Ann – sequence: 77 givenname: Jennifer surname: McDaniel fullname: McDaniel, Jennifer – sequence: 78 givenname: Karen H surname: Miga fullname: Miga, Karen H – sequence: 79 givenname: Matthew W surname: Mitchell 
fullname: Mitchell, Matthew W – sequence: 80 givenname: Jean surname: Monlong fullname: Monlong, Jean – sequence: 81 givenname: Jacquelyn surname: Mountcastle fullname: Mountcastle, Jacquelyn – sequence: 82 givenname: Katherine M surname: Munson fullname: Munson, Katherine M – sequence: 83 givenname: Moses Njagi surname: Mwaniki fullname: Mwaniki, Moses Njagi – sequence: 84 givenname: Maria surname: Nattestad fullname: Nattestad, Maria – sequence: 85 givenname: Adam M surname: Novak fullname: Novak, Adam M – sequence: 86 givenname: Sergey surname: Nurk fullname: Nurk, Sergey – sequence: 87 givenname: Hugh E surname: Olsen fullname: Olsen, Hugh E – sequence: 88 givenname: Nathan D surname: Olson fullname: Olson, Nathan D – sequence: 89 givenname: Benedict surname: Paten fullname: Paten, Benedict – sequence: 90 givenname: Trevor surname: Pesout fullname: Pesout, Trevor – sequence: 91 givenname: Adam M surname: Phillippy fullname: Phillippy, Adam M – sequence: 92 givenname: Alice B surname: Popejoy fullname: Popejoy, Alice B – sequence: 93 givenname: David surname: Porubsky fullname: Porubsky, David – sequence: 94 givenname: Pjotr surname: Prins fullname: Prins, Pjotr – sequence: 95 givenname: Daniela surname: Puiu fullname: Puiu, Daniela – sequence: 96 givenname: Mikko surname: Rautiainen fullname: Rautiainen, Mikko – sequence: 97 givenname: Allison A surname: Regier fullname: Regier, Allison A – sequence: 98 givenname: Arang surname: Rhie fullname: Rhie, Arang – sequence: 99 givenname: Samuel surname: Sacco fullname: Sacco, Samuel – sequence: 100 givenname: Ashley D surname: Sanders fullname: Sanders, Ashley D |
Copyright | 2023 Porubsky et al.; Published by Cold Spring Harbor Laboratory Press. Copyright Cold Spring Harbor Laboratory Press Apr 2023 2023 |
Copyright_xml | – notice: 2023 Porubsky et al.; Published by Cold Spring Harbor Laboratory Press. – notice: Copyright Cold Spring Harbor Laboratory Press Apr 2023 – notice: 2023 |
CorporateAuthor | Human Pangenome Reference Consortium |
CorporateAuthor_xml | – name: Human Pangenome Reference Consortium |
DBID | AAYXX CITATION CGR CUY CVF ECM EIF NPM 7TM 8FD FR3 P64 RC3 7X8 5PM |
DOI | 10.1101/gr.277334.122 |
DatabaseName | CrossRef Medline MEDLINE MEDLINE (Ovid) MEDLINE MEDLINE PubMed Nucleic Acids Abstracts Technology Research Database Engineering Research Database Biotechnology and BioEngineering Abstracts Genetics Abstracts MEDLINE - Academic PubMed Central (Full Participant titles) |
DatabaseTitle | CrossRef MEDLINE Medline Complete MEDLINE with Full Text PubMed MEDLINE (Ovid) Genetics Abstracts Engineering Research Database Technology Research Database Nucleic Acids Abstracts Biotechnology and BioEngineering Abstracts MEDLINE - Academic |
DatabaseTitleList | MEDLINE MEDLINE - Academic CrossRef Genetics Abstracts |
Database_xml | – sequence: 1 dbid: NPM name: PubMed url: https://proxy.k.utb.cz/login?url=http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed sourceTypes: Index Database – sequence: 2 dbid: EIF name: MEDLINE url: https://proxy.k.utb.cz/login?url=https://www.webofscience.com/wos/medline/basic-search sourceTypes: Index Database |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Anatomy & Physiology Chemistry Biology |
DocumentTitleAlternate | Porubsky et al |
EISSN | 1549-5469 |
EndPage | 510 |
ExternalDocumentID | PMC10234299 37164484 10_1101_gr_277334_122 |
Genre | Research Support, Non-U.S. Gov't Journal Article Research Support, N.I.H., Extramural |
GrantInformation_xml | – fundername: NHGRI NIH HHS grantid: R01 HG002385 – fundername: NHGRI NIH HHS grantid: U01 HG010971 – fundername: NHGRI NIH HHS grantid: U24 HG010262 – fundername: NHGRI NIH HHS grantid: R01 HG010485 – fundername: NHGRI NIH HHS grantid: U01 HG010963 – fundername: ; – fundername: Marie Sklodowska-Curie grantid: 956229 – fundername: ; grantid: 5R01HG002385; 5U01HG010971; 1U01HG010973 – fundername: European Union's Horizon 2020 research and innovation programme |
GroupedDBID | --- .GJ 18M 29H 2WC 39C 4.4 53G 5GY 5RE 5VS AAFWJ AAYOK AAYXX AAZTW ABDIX ABDNZ ACGFO ACLKE ACYGS ADBBV ADNWM AEILP AENEX AHPUY AI. ALMA_UNASSIGNED_HOLDINGS BAWUL BTFSW C1A CITATION CS3 DIK DU5 E3Z EBS EJD F5P FRP GX1 H13 HYE IH2 K-O KQ8 MV1 R.V RCX RHI RNS RPM RXW SJN TAE TR2 VH1 W8F WOQ YKV ZCG ZGI ZXP CGR CUY CVF ECM EIF NPM 7TM 8FD FR3 P64 RC3 7X8 5PM |
ID | FETCH-LOGICAL-c416t-7c6a216b72c7436ce30da57ca9dbf883d2e1b1ea664f0426bc4d470d94eac6ce3 |
ISSN | 1088-9051 1549-5469 |
IngestDate | Thu Aug 21 18:36:58 EDT 2025 Thu Jul 10 17:20:32 EDT 2025 Sun Jun 29 13:13:04 EDT 2025 Sun Jul 20 01:30:20 EDT 2025 Thu Apr 24 23:01:21 EDT 2025 Tue Jul 01 02:20:47 EDT 2025 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 4 |
Language | English |
License | 2023 Porubsky et al.; Published by Cold Spring Harbor Laboratory Press. This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/. |
LinkModel | OpenURL |
MergedId | FETCHMERGED-LOGICAL-c416t-7c6a216b72c7436ce30da57ca9dbf883d2e1b1ea664f0426bc4d470d94eac6ce3 |
Notes | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 content type line 23 A complete list of contributing Consortium members appears at the end of this paper. |
ORCID | 0000-0003-0646-7528 0000-0002-9376-1030 0000-0002-8246-4014 0000-0002-9481-013X 0000-0001-7441-532X 0000-0002-2280-9404 0000-0003-2319-2482 0000-0001-8863-3539 0000-0003-3945-0677 0000-0002-2798-3794 0000-0001-8414-8966 0000-0002-8651-1615 0000-0002-5034-1773 |
OpenAccessLink | https://pubmed.ncbi.nlm.nih.gov/PMC10234299 |
PMID | 37164484 |
PQID | 2814050795 |
PQPubID | 2049132 |
PageCount | 15 |
ParticipantIDs | pubmedcentral_primary_oai_pubmedcentral_nih_gov_10234299 proquest_miscellaneous_2812508700 proquest_journals_2814050795 pubmed_primary_37164484 crossref_primary_10_1101_gr_277334_122 crossref_citationtrail_10_1101_gr_277334_122 |
ProviderPackageCode | CITATION AAYXX |
PublicationCentury | 2000 |
PublicationDate | 2023-04-00 20230401 |
PublicationDateYYYYMMDD | 2023-04-01 |
PublicationDate_xml | – month: 04 year: 2023 text: 2023-04-00 |
PublicationDecade | 2020 |
PublicationPlace | United States |
PublicationPlace_xml | – name: United States – name: New York |
PublicationTitle | Genome research |
PublicationTitleAlternate | Genome Res |
PublicationYear | 2023 |
Publisher | Cold Spring Harbor Laboratory Press |
Publisher_xml | – name: Cold Spring Harbor Laboratory Press |
SSID | ssj0003488 |
Score | 2.5488605 |
Snippet | There has been tremendous progress in phased genome assembly production by combining long-read data with parental information or linked-read data.... |
SourceID | pubmedcentral proquest pubmed crossref |
SourceType | Open Access Repository Aggregation Database Index Database Enrichment Source |
StartPage | 496 |
SubjectTerms | Chromosomes; DNA, Satellite - genetics; Genomes; Haplotypes; Humans; Polymorphism, Genetic; Satellite DNA; Segmental Duplications, Genomic; Sequence Analysis, DNA |
Title | Gaps and complex structurally variant loci in phased genome assemblies |
URI | https://www.ncbi.nlm.nih.gov/pubmed/37164484 https://www.proquest.com/docview/2814050795 https://www.proquest.com/docview/2812508700 https://pubmed.ncbi.nlm.nih.gov/PMC10234299 |
Volume | 33 |
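
The abstract distinguishes genes that overlap an assembly gap in at least one haplotype from genes that are recurrently affected in five or more haplotypes. The sketch below illustrates that kind of tally on toy data. It is not the authors' pipeline: the function names, the half-open interval convention, the gene and haplotype labels, and the toy threshold of two haplotypes are all assumptions made for illustration.

```python
from collections import defaultdict


def overlaps(a_start, a_end, b_start, b_end):
    """Return True if half-open intervals [a_start, a_end) and [b_start, b_end) share a base."""
    return a_start < b_end and b_start < a_end


def count_gap_haplotypes(genes, gaps_by_haplotype):
    """Map each gene name to the number of haplotypes in which it overlaps any gap."""
    counts = defaultdict(int)
    for gaps in gaps_by_haplotype.values():
        for name, (chrom, start, end) in genes.items():
            # A gene is counted at most once per haplotype, however many gaps hit it.
            if any(g_chrom == chrom and overlaps(start, end, g_start, g_end)
                   for g_chrom, g_start, g_end in gaps):
                counts[name] += 1
    return dict(counts)


# Hypothetical toy data: gene -> (chromosome, start, end); haplotype -> list of gaps.
genes = {
    "GENE_A": ("chr1", 100, 200),
    "GENE_B": ("chr1", 500, 600),
    "GENE_C": ("chr2", 100, 200),
}
gaps_by_haplotype = {
    "hap1": [("chr1", 150, 160)],
    "hap2": [("chr1", 190, 510)],
    "hap3": [("chr1", 200, 300), ("chr2", 50, 101)],
}

counts = count_gap_haplotypes(genes, gaps_by_haplotype)

# Genes hit in at least one haplotype, and genes hit in at least `min_haplotypes`.
min_haplotypes = 2  # toy threshold; the abstract's recurrence cutoff is five
at_least_one = sorted(counts)
recurrent = sorted(g for g, n in counts.items() if n >= min_haplotypes)

print(at_least_one)
print(recurrent)
```

With real data, the gene and gap coordinates would come from an annotation and from per-haplotype gap calls, and an interval index would replace the nested loops at scale.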