HybPiper: Extracting Coding Sequence and Introns for Phylogenetics from High-Throughput Sequencing Reads Using Target Enrichment

Premise of the study: Using sequence data generated via target enrichment for phylogenetics requires reassembly of high-throughput sequence reads into loci, presenting a number of bioinformatics challenges. We developed HybPiper as a user-friendly platform for assembly of gene regions, extraction of...

Full description

Saved in:
Bibliographic Details
Published inApplications in plant sciences Vol. 4; no. 7
Main Authors Johnson, Matthew G, Gardner, Elliot M, Liu, Yang, Medina, Rafael, Goffinet, Bernard, Shaw, A. Jonathan, Zerega, Nyree J. C, Wickett, Norman J
Format Journal Article
LanguageEnglish
Published United States Botanical Society of America 01.07.2016
John Wiley & Sons, Inc
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Premise of the study: Using sequence data generated via target enrichment for phylogenetics requires reassembly of high-throughput sequence reads into loci, presenting a number of bioinformatics challenges. We developed HybPiper as a user-friendly platform for assembly of gene regions, extraction of exon and intron sequences, and identification of paralogous gene copies. We test HybPiper using baits designed to target 333 phylogenetic markers and 125 genes of functional significance in Artocarpus (Moraceae). Methods and Results: HybPiper implements parallel execution of sequence assembly in three phases: read mapping, contig assembly, and target sequence extraction. The pipeline was able to recover nearly complete gene sequences for all genes in 22 species of Artocarpus. HybPiper also recovered more than 500 bp of nontargeted intron sequence in over half of the phylogenetic markers and identified paralogous gene copies in Artocarpus. Conclusions: HybPiper was designed for Linux and Mac OS X and is freely available at https://github.com/mossmatters/HybPiper.
Bibliography:www.datadryad.org
www.artocarpusresearch.org
We would like to thank A. DeVault at MycroArray for assistance optimizing the target enrichment protocol, and the Field Museum for use of its DNA sequencers. The authors thank B. Faircloth and two anonymous reviewers for helpful comments on an earlier version of the manuscript. This research was funded by National Science Foundation grants to A.J.S. (DEB‐1239980), B.G. (DEB‐1240045 and DEB‐1146295), N.J.W. (DEB‐1239992), and N.J.C.Z. (DEB‐0919119), and by a grant from the Northwestern University Institute for Sustainability and Energy (N.J.C.Z.). Data generated for this study can be found at
(
and the NCBI Sequence Read Archive (SRA; BioProject PRJNA301299).
,
http://dx.doi.org/10.5061/dryad.3293r
ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
We would like to thank A. DeVault at MycroArray for assistance optimizing the target enrichment protocol, and the Field Museum for use of its DNA sequencers. The authors thank B. Faircloth and two anonymous reviewers for helpful comments on an earlier version of the manuscript. This research was funded by National Science Foundation grants to A.J.S. (DEB-1239980), B.G. (DEB-1240045 and DEB-1146295), N.J.W. (DEB-1239992), and N.J.C.Z. (DEB-0919119), and by a grant from the Northwestern University Institute for Sustainability and Energy (N.J.C.Z.). Data generated for this study can be found at www.artocarpusresearch.org, www.datadryad.org (http://dx.doi.org/10.5061/dryad.3293r), and the NCBI Sequence Read Archive (SRA; BioProject PRJNA301299).
ISSN:2168-0450
2168-0450
DOI:10.3732/apps.1600016