GenoPipe: identifying the genotype of origin within (epi)genomic datasets
Published in | Nucleic Acids Research, Vol. 51, no. 22, pp. 12054-12068 |
---|---|
Main Authors | Lang, Olivia W; Srivastava, Divyanshi; Pugh, B Franklin; Lai, William K M |
Format | Journal Article |
Language | English |
Published | England: Oxford University Press, 11 December 2023 |
Subjects | Computational Biology |
Abstract | Abstract
Confidence in experimental results is critical for discovery. As the scale of data generation in genomics has grown exponentially, experimental error has likely kept pace despite the best efforts of many laboratories. Technical mistakes can and do occur at nearly every stage of a genomics assay (e.g. cell line contamination, reagent swapping and tube mislabelling) and are often difficult to identify post-execution. However, the DNA sequenced in genomic experiments carries markers (e.g. indels) that can often be ascertained forensically from experimental datasets. We developed the Genotype validation Pipeline (GenoPipe), a suite of heuristic tools that operate together directly on raw and aligned sequencing data from individual high-throughput sequencing experiments to characterize the underlying genome of the source material. We demonstrate how GenoPipe validates and rescues erroneously annotated experiments by identifying unique markers inherent to an organism's genome (i.e. epitope insertions, gene deletions and SNPs). |
Author | Lang, Olivia W; Srivastava, Divyanshi; Pugh, B Franklin (ORCID 0000-0001-8341-4476); Lai, William K M (ORCID 0000-0003-4351-7037; email: wkl29@cornell.edu) |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/37933851 (record in MEDLINE/PubMed) |
ContentType | Journal Article |
Copyright | The Author(s) 2023. Published by Oxford University Press on behalf of Nucleic Acids Research. |
DOI | 10.1093/nar/gkad950 |
Discipline | Anatomy & Physiology; Chemistry |
EISSN | 1362-4962 |
EndPage | 12068 |
Genre | Journal Article |
GrantInformation | NIEHS NIH HHS, grant R01 ES034353; NIH HHS, grant R01ES034353; funder not named, grant BIO220026 |
ISSN | 0305-1048 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 22 |
Language | English |
License | This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. The Author(s) 2023. Published by Oxford University Press on behalf of Nucleic Acids Research. |
ORCID | 0000-0001-8341-4476 0000-0003-4351-7037 |
OpenAccessLink | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10711449/ |
PMID | 37933851 |
PageCount | 15 |
PublicationDate | 2023-12-11 |
PublicationPlace | England |
PublicationTitle | Nucleic acids research |
PublicationTitleAlternate | Nucleic Acids Res |
PublicationYear | 2023 |
Publisher | Oxford University Press |
StartPage | 12054 |
SubjectTerms | Computational Biology |
Title | GenoPipe: identifying the genotype of origin within (epi)genomic datasets |
URI | https://www.ncbi.nlm.nih.gov/pubmed/37933851 https://search.proquest.com/docview/2886942411 https://pubmed.ncbi.nlm.nih.gov/PMC10711449 |
Volume | 51 |