Whole-exome sequencing to analyze population structure, parental inbreeding, and familial linkage


Bibliographic Details
Published in Proceedings of the National Academy of Sciences - PNAS Vol. 113; no. 24; pp. 6713-6718
Main Authors Belkadi, Aziz; Pedergnana, Vincent; Cobat, Aurélie; Itan, Yuval; Vincent, Quentin B.; Abhyankar, Avinash; Shang, Lei; El Baghdadi, Jamila; Bousfiha, Aziz; Alcais, Alexandre; Boisson, Bertrand; Casanova, Jean-Laurent; Abel, Laurent
Format Journal Article
Language English
Published United States National Academy of Sciences 14.06.2016
Summary: Principal component analysis (PCA), homozygosity rate estimations, and linkage studies in humans are classically conducted through genome-wide single-nucleotide variant arrays (GWSA). We compared whole-exome sequencing (WES) and GWSA for this purpose. We analyzed 110 subjects originating from different regions of the world, including North Africa and the Middle East, which are poorly covered by public databases and have high consanguinity rates. We tested and applied a number of quality control (QC) filters. Compared with GWSA, we found that WES provided an accurate prediction of population substructure using variants with a minor allele frequency > 2% (correlation = 0.89 with the PCA coordinates obtained by GWSA). WES also yielded highly reliable estimates of homozygosity rates using runs of homozygosity with a 1,000-kb window (correlation = 0.94 with the estimates provided by GWSA). Finally, homozygosity mapping analyses in 15 families including a single offspring with high homozygosity rates showed that WES provided 51% less genome-wide linkage information than GWSA overall but 97% more information for the coding regions. At the genome-wide scale, 76.3% of linked regions were found by both GWSA and WES, 17.7% were found by GWSA only, and 6.0% were found by WES only. For coding regions, the corresponding percentages were 83.5%, 7.4%, and 9.1%, respectively. With appropriate QC filters, WES can be used for PCA and adjustment for population substructure, estimating homozygosity rates in individuals, and powerful linkage analyses, particularly in coding regions.
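To make the PCA step in the summary concrete, here is a minimal Python sketch of filtering variants by minor allele frequency (MAF) and projecting samples onto principal components. It is an illustration only, not the authors' pipeline: the function names, the 0/1/2 genotype coding, the use of centering without variance scaling, and the simulated data are all assumptions. Only the 2% MAF threshold is taken from the summary.

```python
import numpy as np


def minor_allele_frequency(genotypes):
    """Per-variant MAF from a (samples x variants) matrix of 0/1/2 alternate-allele counts."""
    alt_freq = genotypes.mean(axis=0) / 2.0
    return np.minimum(alt_freq, 1.0 - alt_freq)


def pca_coordinates(genotypes, maf_threshold=0.02, n_components=2):
    """Project samples onto the top principal components of common variants.

    Hypothetical helper: keeps variants with MAF above the threshold,
    centers each variant, and uses an SVD of the centered matrix.
    """
    maf = minor_allele_frequency(genotypes)
    common = genotypes[:, maf > maf_threshold].astype(float)
    centered = common - common.mean(axis=0)
    # Rows of u * s are the sample coordinates on each principal component.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


if __name__ == "__main__":
    # Simulated data (assumption): two groups of 20 samples, 500 variants each,
    # with independently drawn allele frequencies per group.
    rng = np.random.default_rng(0)
    freqs_a = rng.uniform(0.1, 0.9, size=500)
    freqs_b = rng.uniform(0.1, 0.9, size=500)
    pop_a = rng.binomial(2, freqs_a, size=(20, 500))
    pop_b = rng.binomial(2, freqs_b, size=(20, 500))
    coords = pca_coordinates(np.vstack([pop_a, pop_b]))
    print(coords[:3])
```

Coordinates computed this way from two genotyping platforms on the same samples could then be compared with `np.corrcoef`, keeping in mind that the sign of each principal component is arbitrary, so the absolute value of the correlation is the meaningful quantity.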
PMCID: PMC4914194
Author contributions: A. Belkadi, V.P., A.C., Y.I., A. Alcais, J.-L.C., and L.A. designed research; A. Belkadi, V.P., A.C., Y.I., Q.B.V., A. Abhyankar, L.S., J.E.B., A. Bousfiha, E./A.C., and B.B. performed research; A. Belkadi, V.P., A.C., Y.I., Q.B.V., A. Abhyankar, and B.B. contributed new reagents/analytic tools; A. Belkadi, V.P., A.C., Y.I., Q.B.V., A. Abhyankar, L.S., A. Alcais, B.B., J.-L.C., and L.A. analyzed data; A. Belkadi, V.P., A.C., Y.I., A. Alcais, J.-L.C., and L.A. wrote the paper; and J.E.B., A. Bousfiha, and E./A.C. provided clinical data.
Reviewers: L.B.B., Centre Hospitalo-Universitaire (CHU) Sainte-Justine/University of Montreal; and J.F., Ecole Polytechnique Fédérale de Lausanne (EPFL).
A. Belkadi and V.P. contributed equally to this work.
Contributed by Jean-Laurent Casanova, April 27, 2016 (sent for review March 3, 2016; reviewed by Luis B. Barreiro and Jacques Fellay)
ISSN: 0027-8424
1091-6490
DOI: 10.1073/pnas.1606460113