Accurate, scalable and integrative haplotype estimation


Bibliographic Details
Published in Nature Communications Vol. 10; no. 1; pp. 5436 - 10
Main Authors Delaneau, Olivier; Zagury, Jean-François; Robinson, Matthew R.; Marchini, Jonathan L.; Dermitzakis, Emmanouil T.
Format Journal Article
Language English
Published London Nature Publishing Group UK 28.11.2019
Nature Publishing Group
Nature Portfolio
Summary: The number of human genomes being genotyped or sequenced increases exponentially and efficient haplotype estimation methods able to handle this amount of data are now required. Here we present a method, SHAPEIT4, which substantially improves upon other methods to process large genotype and high coverage sequencing datasets. It notably exhibits sub-linear running times with sample size, provides highly accurate haplotypes and allows integrating external phasing information such as large reference panels of haplotypes, collections of pre-phased variants and long sequencing reads. We provide SHAPEIT4 in an open source format and demonstrate its performance in terms of accuracy and running times on two gold standard datasets: the UK Biobank data and the Genome In A Bottle. Haplotype information inferred by phasing is useful in genetic and genomic analysis. Here, the authors develop SHAPEIT4, a phasing method that exhibits sub-linear running time, provides accurate haplotypes and enables integration of external phasing information.
ISSN: 2041-1723
DOI: 10.1038/s41467-019-13225-y