CAPG: comprehensive allopolyploid genotyper


Bibliographic Details
Published in Bioinformatics (Oxford, England) Vol. 39; no. 1
Main Authors Kulkarni, Roshan, Zhang, Yudi, Cannon, Steven B, Dorman, Karin S
Format Journal Article
Language English
Published England Oxford University Press 01.01.2023
Summary:
Motivation: Genotyping by sequencing is a powerful tool for investigating genetic variation in plants, but many economically important plants are allopolyploids, where homoeologous similarity obscures the subgenomic origin of reads and confounds allelic and homoeologous SNPs. Recent polyploid genotyping methods use allelic frequencies, rate of heterozygosity, parental cross or other information to resolve read assignment, but good subgenomic references offer the most direct information. The typical strategy aligns reads to the joint reference, performs diploid genotyping within each subgenome, and filters the results, but persistent read misassignment results in an excess of false heterozygous calls.
Results: We introduce the Comprehensive Allopolyploid Genotyper (CAPG), which formulates an explicit likelihood to weight read alignments against both subgenomic references and genotype individual allopolyploids from whole-genome resequencing data. We demonstrate CAPG in allotetraploids, where it performs better than Genome Analysis Toolkit’s HaplotypeCaller applied to reads aligned to the combined subgenomic references.
Availability and implementation: Code and tutorials are available at https://github.com/Kkulkarni1/CAPG.git.
Supplementary information: Supplementary data are available at Bioinformatics online.
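The idea of weighting a read's alignments against both subgenomic references can be illustrated with a toy calculation. The sketch below is not CAPG's model or code: it assumes ungapped alignments, independent base errors derived from Phred quality scores, and a fixed prior on subgenomic origin, and every function name in it is hypothetical.

```python
import math


def read_log_likelihood(read, quals, ref):
    """Log-likelihood of a read given a reference segment.

    Assumption (illustrative only): bases are independent, and a base with
    Phred quality q is wrong with probability 10^(-q/10), with the error
    spread evenly over the three other nucleotides.
    """
    ll = 0.0
    for base, q, ref_base in zip(read, quals, ref):
        err = 10 ** (-q / 10)
        if base == ref_base:
            ll += math.log(1 - err)
        else:
            ll += math.log(err / 3)
    return ll


def subgenome_weights(read, quals, ref_a, ref_b, prior_a=0.5):
    """Posterior weights that the read came from subgenome A or B."""
    log_a = math.log(prior_a) + read_log_likelihood(read, quals, ref_a)
    log_b = math.log(1 - prior_a) + read_log_likelihood(read, quals, ref_b)
    # Subtract the maximum before exponentiating to avoid underflow.
    m = max(log_a, log_b)
    w_a = math.exp(log_a - m)
    w_b = math.exp(log_b - m)
    total = w_a + w_b
    return w_a / total, w_b / total


# Hypothetical example: the two references differ at one homoeologous site.
ref_a = "ACGTACGT"
ref_b = "ACGTACCT"
read = "ACGTACGT"
quals = [30] * len(read)
w_a, w_b = subgenome_weights(read, quals, ref_a, ref_b)
print(f"weight A = {w_a:.5f}, weight B = {w_b:.5f}")
```

A read that matches both references equally well receives equal weights under this sketch, which is the situation in which hard assignment to one subgenome would be arbitrary and could contribute to spurious heterozygous calls.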
Bibliography: SC0014664; USDOE Office of Science (SC)
ISSN: 1367-4811, 1367-4803
DOI: 10.1093/bioinformatics/btac729