Genotyping and Copy Number Analysis of Immunoglobin Heavy Chain Variable Genes Using Long Reads

Bibliographic Details
Published in iScience Vol. 23; no. 3; p. 100883
Main Authors Ford, Michael, Haghshenas, Ehsan, Watson, Corey T., Sahinalp, S. Cenk
Format Journal Article
Language English
Published United States Elsevier Inc 27.03.2020
Elsevier
ISSN 2589-0042
DOI 10.1016/j.isci.2020.100883

Summary: One of the remaining challenges to describing an individual's genetic variation lies in the highly heterogeneous and complex genomic regions that impede the use of classical reference-guided mapping and assembly approaches. One such region is the Immunoglobulin heavy chain locus (IGH), which is critical for the development of antibodies and the adaptive immune system. We describe ImmunoTyper, the first PacBio-based genotyping and copy number calling tool specifically designed for IGH V genes (IGHV). We demonstrate that ImmunoTyper's multi-stage clustering and combinatorial optimization approach represents the most comprehensive IGHV genotyping approach published to date, through validation using gold-standard IGH reference sequence. This preliminary work establishes the feasibility of fine-grained genotype and copy number analysis using error-prone long reads in complex multi-gene loci and opens the door for in-depth investigation into IGHV heterogeneity using accessible and increasingly common whole-genome sequence.

Highlights:
• We describe ImmunoTyper, a WGS Immunoglobulin Heavy Chain Variable Genotyping tool
• ImmunoTyper is the first such tool to use long reads and call alleles for pseudogenes
• We demonstrate high allele call accuracy using simulated and real WGS data

Subjects: Biological Sciences; Bioinformatics; Computational Bioinformatics; Genomic Analysis