Quality Control and Integration of Genotypes from Two Calling Pipelines for Whole Genome Sequence Data in the Alzheimer's Disease Sequencing Project

Bibliographic Details
Published in bioRxiv
Main Authors Naj, Adam C; Lin, Honghuang; Vardarajan, Badri N; White, Simon; Lancour, Daniel; Ma, Yiyi; Schmidt, Michael; Sun, Fangui; Butkiewicz, Mariusz; Bush, William S; Kunkle, Brian W; Malamon, John; Amin, Najaf; Choi, Seung H; Hamilton-Nelson, Kara L; van der Lee, Sven J; Gupta, Namrata; Koboldt, Daniel C; Saad, Mohamad; Wang, Bowen; Nato, Alejandro Q; Sohi, Harkirat K; Kuzma, Amanda; Alzheimer's Disease Sequencing Project (ADSP); Wang, Li-San; Cupples, L Adrienne; van Duijn, Cornelia; Seshadri, Sudha; Schellenberg, Gerard D; Boerwinkle, Eric; Bis, Joshua C; Dupuis, Josee; Salerno, William J; Wijsman, Ellen M; Martin, Eden; DeStefano, Anita L
Format Paper
Language English
Published Cold Spring Harbor: Cold Spring Harbor Laboratory Press, 11.05.2018
Cold Spring Harbor Laboratory
Edition 1.1
Abstract The Alzheimer's Disease Sequencing Project (ADSP) performed whole genome sequencing (WGS) of 584 subjects from 111 multiplex families at three sequencing centers. Genotype calling of single nucleotide variants (SNVs) and insertion-deletion variants (indels) was performed centrally using GATK-HaplotypeCaller and Atlas V2. The ADSP Quality Control (QC) Working Group applied QC protocols to project-level variant call format files (VCFs) from each pipeline, and developed and implemented a novel protocol, termed "consensus calling," to combine genotype calls from both pipelines into a single high-quality set. QC was applied to autosomal bi-allelic SNVs and indels, and included pipeline-recommended QC filters, variant-level QC, and sample-level QC. Low-quality variants or genotypes were excluded, and sample outliers were noted. Quality was assessed by examining Mendelian inconsistencies (MIs) among 67 parent-offspring pairs, and MIs were used to establish additional genotype-specific filters for GATK calls. After QC, 578 subjects remained. Pipeline-specific QC excluded ~12.0% of GATK and 14.5% of Atlas SNVs. Between pipelines, ~91% of SNV genotypes across all QCed variants were concordant; 4.23% and 4.56% of genotypes were exclusive to Atlas or GATK, respectively; the remaining ~0.01% of genotypes were discordant and were excluded. For indels, variant-level QC excluded ~36.8% of GATK and 35.3% of Atlas indels. Between pipelines, ~55.6% of indel genotypes were concordant; 10.3% and 28.3% were exclusive to Atlas or GATK, respectively; and the ~0.29% of genotypes that were discordant were excluded. The final WGS consensus dataset contains 27,896,774 SNVs and 3,133,926 indels and is publicly available.
Abbreviations: AD, Alzheimer's disease; QC, quality control; LSSAC, Large-Scale Sequencing and Analysis Center; Broad, Broad Institute Genomics Service; Baylor, Baylor College of Medicine Human Genome Sequencing Center; WashU, Washington University-St. Louis McDonnell Genome Institute; WGS, whole genome sequencing; WES, whole exome sequencing; indel, insertion-deletion variant; VCF, variant call format; MI, Mendelian inconsistency; MC, Mendelian consistency; GWAS, genome-wide association study; VR, referent allele read depth; DP, overall read depth; MS, mapping score; GQ, genotype quality score; Ti/Tv, transition/transversion ratio; CS, concordance code
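The consensus-calling idea and the parent-offspring check described in the abstract can be illustrated with a minimal Python sketch. This is not the ADSP implementation. It assumes VCF-style genotype strings such as "0/1", treats a missing or filtered call as `None` or "./.", and assumes (as the abstract's wording suggests but does not state outright) that a genotype called by only one pipeline is retained. The function names and status labels are hypothetical.

```python
from typing import Optional, Tuple

Genotype = Optional[Tuple[str, ...]]


def normalize(gt: Optional[str]) -> Genotype:
    """Return an unordered genotype as a sorted allele tuple, or None if missing."""
    if gt is None or "." in gt:
        return None
    return tuple(sorted(gt.replace("|", "/").split("/")))


def consensus_call(gatk_gt: Optional[str], atlas_gt: Optional[str]):
    """Combine one genotype from each pipeline (illustrative rule, see assumptions above).

    Returns (consensus genotype or None, status label).
    """
    g, a = normalize(gatk_gt), normalize(atlas_gt)
    if g is None and a is None:
        return None, "missing"
    if a is None:
        return "/".join(g), "gatk_only"    # call exclusive to GATK, assumed kept
    if g is None:
        return "/".join(a), "atlas_only"   # call exclusive to Atlas, assumed kept
    if g == a:
        return "/".join(g), "concordant"
    return None, "discordant"              # pipelines disagree: excluded


def is_mendelian_inconsistent(parent_gt: str, child_gt: str) -> Optional[bool]:
    """Duo check: a child must share at least one allele with its parent.

    Returns None when either genotype is missing. A full trio check is stricter.
    """
    p, c = normalize(parent_gt), normalize(child_gt)
    if p is None or c is None:
        return None
    return not (set(p) & set(c))
```

For a bi-allelic site, the duo check flags only opposite homozygotes (parent "0/0" with child "1/1", or the reverse), so it detects a subset of the errors a trio-based check would catch.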
ContentType Paper
Copyright 2018. This article is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ ( the License ). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
2018, Posted by Cold Spring Harbor Laboratory
DOI 10.1101/318857
Discipline Biology
EISSN 2692-8205
Edition 1.1
ExternalDocumentID 318857v1
Genre Working Paper/Pre-Print
ISSN 2692-8205
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed false
IsScholarly false
Keywords whole genome sequencing
GATK
consensus calling
Atlas
quality control
Mendelian inconsistencies
Language English
License This pre-print is available under a Creative Commons License (Attribution-NonCommercial-NoDerivs 4.0 International), CC BY-NC-ND 4.0, as described at http://creativecommons.org/licenses/by-nc-nd/4.0
OpenAccessLink https://www.biorxiv.org/content/10.1101/318857
PageCount 39
PublicationDate 20180511
PublicationDateYYYYMMDD 2018-05-11
PublicationDate_xml – month: 05
  year: 2018
  text: 20180511
  day: 11
PublicationDecade 2010
PublicationPlace Cold Spring Harbor
PublicationPlace_xml – name: Cold Spring Harbor
PublicationTitle bioRxiv
PublicationYear 2018
Publisher Cold Spring Harbor Laboratory Press
Cold Spring Harbor Laboratory
Publisher_xml – name: Cold Spring Harbor Laboratory Press
– name: Cold Spring Harbor Laboratory
SubjectTerms Alzheimer's disease
Filters
Gene deletion
Genetics
Genomes
Genotype & phenotype
Genotypes
Insertion
Neurodegenerative diseases
Nucleotide sequence
Pipelines
Quality control
URI https://www.proquest.com/docview/2071209767
https://www.biorxiv.org/content/10.1101/318857