Whole genome characterization of sequence diversity of 15,220 Icelanders

Bibliographic Details
Published in Scientific Data Vol. 4; no. 1; p. 170115
Main Authors Jónsson, Hákon; Sulem, Patrick; Kehr, Birte; Kristmundsdottir, Snaedis; Zink, Florian; Hjartarson, Eirikur; Hardarson, Marteinn T.; Hjorleifsson, Kristjan E.; Eggertsson, Hannes P.; Gudjonsson, Sigurjon Axel; Ward, Lucas D.; Arnadottir, Gudny A.; Helgason, Einar A.; Helgason, Hannes; Gylfason, Arnaldur; Jonasdottir, Adalbjorg; Jonasdottir, Aslaug; Rafnar, Thorunn; Besenbacher, Soren; Frigge, Michael L.; Stacey, Simon N.; Magnusson, Olafur Th; Thorsteinsdottir, Unnur; Masson, Gisli; Kong, Augustine; Halldorsson, Bjarni V.; Helgason, Agnar; Gudbjartsson, Daniel F.; Stefansson, Kari
Format Journal Article
Language English
Published London: Nature Publishing Group UK, 21.09.2017
Summary: Understanding of sequence diversity is the cornerstone of analysis of genetic disorders, population genetics, and evolutionary biology. Here, we present an update of our sequencing set to 15,220 Icelanders who we sequenced to an average genome-wide coverage of 34X. We identified 39,020,168 autosomal variants passing GATK filters: 31,079,378 SNPs and 7,940,790 indels. Calling de novo mutations (DNMs) is a formidable challenge given the high false positive rate in sequencing datasets relative to the mutation rate. Here we addressed this issue by using segregation of alleles in three-generation families. Using this transmission assay, we controlled the false positive rate and identified 108,778 high quality DNMs. Furthermore, we used our extended family structure and read pair tracing of DNMs to a panel of phased SNPs, to determine the parent of origin of 42,961 DNMs.
Design Type(s): individual genetic characteristics comparison design • population genetics analysis objective
Measurement Type(s): genetic sequence variation analysis
Technology Type(s): whole genome sequencing
Factor Type(s):
Sample Characteristic(s): Homo sapiens • whole blood • buccal epithelium
Machine-accessible metadata file describing the reported data (ISA-Tab format)
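The summary mentions using segregation of alleles in three-generation families as a transmission assay for DNM candidates. The following is only a minimal illustrative sketch of that general idea, not the authors' pipeline: it assumes genotypes at a single site are given as alternate-allele counts (0, 1 or 2) per sample, and the names `Trio`, `is_candidate_dnm`, `transmitted_to_offspring` and `passes_transmission_check` are hypothetical. A candidate seen in a proband but in neither parent gains support if at least one of the proband's own children carries it.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Trio:
    """Sample IDs for a proband and both parents (hypothetical structure)."""
    child: str
    father: str
    mother: str


def is_candidate_dnm(genotypes: Dict[str, int], trio: Trio) -> bool:
    """True if the alternate allele is seen in the child but in neither parent."""
    return (
        genotypes[trio.child] > 0
        and genotypes[trio.father] == 0
        and genotypes[trio.mother] == 0
    )


def transmitted_to_offspring(genotypes: Dict[str, int], offspring: List[str]) -> bool:
    """True if at least one child of the proband carries the alternate allele."""
    return any(genotypes[sample] > 0 for sample in offspring)


def passes_transmission_check(
    genotypes: Dict[str, int], trio: Trio, offspring: List[str]
) -> bool:
    """Candidate DNM in the proband that is also seen in the third generation."""
    return is_candidate_dnm(genotypes, trio) and transmitted_to_offspring(
        genotypes, offspring
    )


# Illustrative genotypes at one site (made-up sample IDs and values).
trio = Trio(child="proband", father="father", mother="mother")
site = {"father": 0, "mother": 0, "proband": 1, "kid1": 1, "kid2": 0}
supported = passes_transmission_check(site, trio, ["kid1", "kid2"])
```

A true heterozygous DNM is expected to reach each child only about half the time, so a candidate that is never transmitted is not necessarily false. A check of this kind is better suited to estimating the false positive rate of a candidate set than to serving as a strict per-site filter.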
Author contributions: H.J., P.S., B.V.H., A.H., D.F.G. and K.S. wrote the manuscript with input from S.N.S., U.T., G.M. and A.K. H.J., F.Z., E.H., M.T.H., K.E.H., E.A.H. and D.F.G. analyzed the data. H.J., B.K., S.K., F.Z., E.H., M.T.H., K.E.H., H.P.E., E.A.H., A.G. and D.F.G. created methods for analyzing the data. Adalbjorg J., Aslaug J. and O.Th.M. performed the experiments. S.A.G., L.D.W., G.A.A., H.H., T.R. and M.F. collected the samples and information. H.J., P.S., U.T., G.M., A.K., B.V.H., A.H., D.F.G. and K.S. designed the study.
ISSN: 2052-4463
DOI: 10.1038/sdata.2017.115