Application of whole genome sequence data in analyzing the molecular epidemiology of Shiga toxin-producing Escherichia coli O157:H7/H-

Bibliographic Details
Published in	International journal of food microbiology Vol. 264; pp. 39-45
Main Authors	Yokoyama, Eiji; Hirai, Shinichiro; Ishige, Taichiro; Murakami, Satoshi
Format	Journal Article
Language	English
Published	Netherlands: Elsevier B.V., 02.01.2018
Subjects	Backbone; Cluster analysis; Clusters; Data processing; Disease Outbreaks; DNA, Intergenic - genetics; E coli; Electrophoresis, Gel, Pulsed-Field; Epidemiology; Escherichia coli; Escherichia coli Infections - epidemiology; Escherichia coli Infections - microbiology; Escherichia coli O157 - genetics; Escherichia coli O157 - isolation & purification; Escherichia coli O157 - pathogenicity; Gel electrophoresis; Gene mapping; Genes; Genome, Bacterial - genetics; Genomes; Genotype; Genotypes; Humans; Identification methods; Mapping; Molecular Epidemiology; Nucleotide sequence; Outbreaks; Polymorphism; Polymorphism, Single Nucleotide - genetics; Pulsed-field gel electrophoresis; Shiga toxin; Shiga Toxin - biosynthesis; Shiga toxin producing Escherichia coli O157:H7; Single nucleotide polymorphism; Strain; Strains (organisms); Whole genome sequence
Summary:	Seventeen clusters of Shiga toxin-producing Escherichia coli O157:H7/- (O157) strains, determined by cluster analysis of pulsed-field gel electrophoresis patterns, were analyzed using whole genome sequence (WGS) data to investigate this pathogen's molecular epidemiology. The 17 clusters comprised 136 strains, including strains from nine outbreaks, each of which was caused by a single source contaminated with the organism, as shown by epidemiological contact surveys. WGS data from these strains were used to identify single nucleotide polymorphisms (SNPs) by two methods: SNPs called by mapping short read data directly to a reference genome (mapping-derived SNPs), and SNPs shared between the mapping-derived SNPs and SNPs found in assemblies of the short read data (common SNPs). For both SNP types, SNPs detected in genes with a gap were excluded to remove ambiguous SNPs from further analysis. The effectiveness of both SNP types was investigated in five SNP sets: all detected SNPs concatenated (whole SNP set); three sets defined by the genes in which the SNPs were located (backbone SNP set, O-island SNP set, and mobile element SNP set); and SNPs in non-coding regions (intergenic region SNP set). When SNPs from strains isolated from the nine single-source outbreaks were analyzed using an unweighted pair group method with arithmetic mean (UPGMA) tree and a minimum spanning tree (MST), the maximum pairwise distances of the backbone SNP set of the mapping-derived SNPs were significantly smaller than those of the whole and intergenic region SNP sets on both UPGMA trees and MSTs. This significant difference was also observed when the backbone SNP set of the common SNPs was examined (Steel-Dwass test, P≤0.01). When the maximum pairwise distances were compared between the mapping-derived and common SNPs, significant differences were observed for the whole, mobile element, and intergenic region SNP sets (Wilcoxon signed rank test, P≤0.01).
When all the strains included in one complex on an MST or one cluster on a UPGMA tree were designated as the same genotype, the values of the Hunter-Gaston Discriminatory Power Index for the backbone SNP set of the mapping-derived and common SNPs were higher than those of the other SNP sets. In contrast, the mobile element SNP set could not robustly subdivide lineage I strains among the tested O157 strains with either the mapping-derived or the common SNPs. These results suggest that the backbone SNP set is the most effective for analyzing O157 WGS data to characterize its molecular epidemiology.
• A total of 136 Shiga toxin-producing Escherichia coli O157 strains were investigated.
• Whole genome sequence data could subdivide PFGE clusters of STEC O157.
• SNPs in backbone regions were most effective in subdividing those clusters.
• SNPs in mobile element regions could not robustly subdivide lineage I strains.
• SNPs in non-coding regions were inferior to other SNPs in identifying an outbreak.
ISSN:	0168-1605, 1879-3460
DOI:	10.1016/j.ijfoodmicro.2017.10.019
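
The summary compares SNP sets by two quantities: the maximum pairwise distance among strains from one outbreak, and the Hunter-Gaston Discriminatory Power Index. The following is a minimal sketch of both, not the authors' pipeline. It assumes each strain is represented by a concatenated SNP string of equal length and counts raw differences between strings, whereas the study measured distances on UPGMA trees and MSTs, so values could differ. The function names and the toy data are hypothetical. The index uses the standard Hunter-Gaston formula D = 1 - Σ n_j(n_j - 1) / (N(N - 1)), where n_j is the number of strains assigned to genotype j and N is the total number of strains.

```python
from collections import Counter
from itertools import combinations


def snp_distance(a: str, b: str) -> int:
    """Count positions at which two equal-length concatenated SNP strings differ."""
    if len(a) != len(b):
        raise ValueError("SNP strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def max_pairwise_distance(profiles: dict[str, str]) -> int:
    """Largest SNP distance between any two strains in a group (0 if fewer than two)."""
    return max(
        (snp_distance(a, b) for a, b in combinations(profiles.values(), 2)),
        default=0,
    )


def hunter_gaston_index(genotypes: list[str]) -> float:
    """Hunter-Gaston index: 1 - sum(n_j * (n_j - 1)) / (N * (N - 1))."""
    n = len(genotypes)
    if n < 2:
        raise ValueError("need at least two strains")
    counts = Counter(genotypes).values()
    return 1.0 - sum(c * (c - 1) for c in counts) / (n * (n - 1))


# Hypothetical toy data for illustration only; not taken from the study.
outbreak = {"s1": "ACGTA", "s2": "ACGTA", "s3": "ACGTT"}
genotypes = ["g1", "g1", "g2", "g3"]

print(max_pairwise_distance(outbreak))
print(round(hunter_gaston_index(genotypes), 3))
```

Under these assumptions, a smaller maximum pairwise distance within a single-source outbreak means the SNP set keeps related strains together, while a higher index means it separates unrelated strains into more genotypes.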