SNVPhyl: a single nucleotide variant phylogenomics pipeline for microbial genomic epidemiology

Bibliographic Details
Published in Microbial genomics Vol. 3; no. 6; p. e000116
Main Authors Petkau, Aaron; Mabon, Philip; Sieffert, Cameron; Knox, Natalie C; Cabral, Jennifer; Iskander, Mariam; Iskander, Mark; Weedmark, Kelly; Zaheer, Rahat; Katz, Lee S; Nadon, Celine; Reimer, Aleisha; Taboada, Eduardo; Beiko, Robert G; Hsiao, William; Brinkman, Fiona; Graham, Morag; Van Domselaar, Gary
Format Journal Article
Language English
Published England Microbiology Society 30.06.2017
Summary: The recent widespread application of whole-genome sequencing (WGS) for microbial disease investigations has spurred the development of new bioinformatics tools, including a notable proliferation of phylogenomics pipelines designed for infectious disease surveillance and outbreak investigation. Transitioning the use of WGS data out of the research laboratory and into the front lines of surveillance and outbreak response requires user-friendly, reproducible and scalable pipelines that have been well validated. Single Nucleotide Variant Phylogenomics (SNVPhyl) is a bioinformatics pipeline for identifying high-quality single-nucleotide variants (SNVs) and constructing a whole-genome phylogeny from a collection of WGS reads and a reference genome. Individual pipeline components are integrated into the Galaxy bioinformatics framework, enabling data analysis in a user-friendly, reproducible and scalable environment. We show that SNVPhyl can detect SNVs with high sensitivity and specificity, and identify and remove regions of high SNV density (indicative of recombination). SNVPhyl is able to correctly distinguish outbreak from non-outbreak isolates across a range of variant-calling settings, sequencing-coverage thresholds or in the presence of contamination. SNVPhyl is available as a Galaxy workflow, Docker and virtual machine images, and a Unix-based command-line application. SNVPhyl is released under the Apache 2.0 license and available at http://snvphyl.readthedocs.io/ or at https://github.com/phac-nml/snvphyl-galaxy.
All supporting data, code and protocols have been provided within the article or through supplementary data files. Two supplementary figures and three supplementary tables are available with the online Supplementary Material.
ISSN:2057-5858
DOI:10.1099/mgen.0.000116