Intra‐individual polymorphism in chloroplasts from NGS data: where does it come from and how to handle it?

Bibliographic Details
Published in	Molecular Ecology Resources, Vol. 16, no. 2, pp. 434-445
Main Authors	Scarcelli, N., Mariac, C., Couvreur, T. L. P., Faye, A., Richard, D., Sabot, F., Berthouly-Salazar, C., Vigouroux, Y.
Format	Journal Article
Language	English
Published	England; Blackwell Pub; 01.03.2016; Blackwell Publishing Ltd; Wiley Subscription Services, Inc; Wiley/Blackwell
Subjects	Chloroplast; Chloroplasts; Chloroplasts - genetics; Computational Biology; DNA, Chloroplast - chemistry; DNA, Chloroplast - genetics; gatk; Genetics; Genome, Chloroplast; Genomes; Genotype; High-Throughput Nucleotide Sequencing; intra-individual polymorphism; Life Sciences; NGS; Plants genetics; Polymorphism, Genetic; Polymorphism, Single Nucleotide; samtools; SNP
Summary:	Next‐generation sequencing (NGS) gives access to large quantities of genomic data. In plants, several studies have used whole chloroplast genome sequences to infer phylogeography or phylogeny. Even though the chloroplast is a haploid organelle, NGS plastome data have revealed a nonnegligible number of intra‐individual polymorphic SNPs. Such observations could have several causes, such as sequencing errors, heteroplasmy, or the transfer of chloroplast sequences into the nuclear and mitochondrial genomes. The occurrence of allelic diversity has important practical consequences for the identification of diversity and the analysis of chloroplast data and, beyond that, raises significant evolutionary questions. In this study, we show that the observed intra‐individual polymorphism of chloroplast sequence data is probably the result of plastid DNA transferred into the mitochondrial and/or the nuclear genomes. We further assess the error rates of nine different bioinformatics pipelines for SNP and genotype calling, using SNPs identified by Sanger sequencing. Specific pipelines are adequate to deal with this issue, optimizing both specificity and sensitivity. Our results will allow proper use of whole chloroplast NGS sequences and better handling of NGS chloroplast sequence diversity.
Bibliography:	http://dx.doi.org/10.1111/1755-0998.12462; ark:/67375/WNG-6LN1K7PD-L; istex:FEAB7BB6E6A1D8A7555A7E69B4A58C3C57090B89; ArticleID:MEN12462. Funding: Agropolis Foundation, No. ANR-10-LABX-0001-01; Agence Nationale de Recherche, No. AFRICROP ANR-13-BSV7-0017. Supporting information: Table S1. List of GenBank (.fasta), NCBI-SRA (.fastq) and DRYAD (.fastq, .bam and .vcf) ID resources used in this paper. Table S2. Scripts used to prepare raw data, map to the reference and call SNPs (12 different methods). List of adaptors and tags used during the libraries' construction. Table S3. Number of reads mapping on the chloroplast, mitochondrial and nuclear genomes for the rice sample. Table S4. Total number of SNP [1,1] and polymorphic positions [0,1] observed for each species and for each calling method.
ISSN:	1755-098X 1755-0998
DOI:	10.1111/1755-0998.12462