Sequence clustering threshold has little effect on the recovery of microbial community structure

Bibliographic Details
Published in Molecular Ecology Resources Vol. 18; no. 5; pp. 1064-1076
Main Authors Botnen, Synnøve Smebye; Davey, Marie Louise; Halvorsen, Rune; Kauserud, Håvard
Format Journal Article
Language English, Norwegian
Published England Wiley Subscription Services, Inc 01.09.2018
Summary: Analysis of microbial community structure by multivariate ordination methods, using data obtained by high‐throughput sequencing of amplified markers (i.e., DNA metabarcoding), often requires clustering of DNA sequences into operational taxonomic units (OTUs). Parameters for the clustering procedure tend not to be justified but are set by tradition rather than being based on explicit knowledge. In this study, we explore the extent to which ordination results are affected by variation in parameter settings for the clustering procedure. Amplicon sequence data from nine microbial community studies, representing different sampling designs, spatial scales and ecosystems, were subjected to clustering into OTUs at seven different similarity thresholds (clustering thresholds) ranging from 87% to 99% sequence similarity. The 63 data sets thus obtained were subjected to parallel DCA and GNMDS ordinations. The resulting community structures were highly similar across all clustering thresholds. We explain this pattern by the existence of strong ecological structuring gradients and phylogenetically diverse sets of abundant OTUs that are highly stable across clustering thresholds. Removing low‐abundance, rare OTUs had negligible effects on community patterns. Our results indicate that microbial data sets with a clear gradient structure are highly robust to choice of sequence clustering threshold.
ISSN: 1755-098X, 1755-0998
DOI: 10.1111/1755-0998.12894
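
The comparison described in the Summary (cluster the same reads at several similarity thresholds, then check whether between-sample structure changes) can be illustrated with a small toy sketch. Everything here is an assumption made for illustration and is not the authors' pipeline: the reads are synthetic, the clustering is a naive greedy centroid procedure, the threshold step of 2 percentage points is a guess at how seven thresholds span 87% to 99%, and Bray-Curtis dissimilarities compared by Pearson correlation stand in for the DCA and GNMDS ordinations used in the study.

```python
from itertools import combinations

def matches(a, b):
    """Number of identical positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b))

def greedy_cluster(seqs, pct):
    """Assign each read to the first centroid with >= pct % identity (toy method)."""
    centroids, labels = [], []
    for s in seqs:
        for i, c in enumerate(centroids):
            if matches(s, c) * 100 >= pct * len(s):  # integer comparison, no float error
                labels.append(i)
                break
        else:
            centroids.append(s)
            labels.append(len(centroids) - 1)
    return labels

def otu_table(sample_reads, pct):
    """Cluster all reads together, then count OTUs per sample."""
    all_reads = [r for reads in sample_reads for r in reads]
    labels = iter(greedy_cluster(all_reads, pct))
    table = []
    for reads in sample_reads:
        counts = {}
        for _ in reads:
            otu = next(labels)
            counts[otu] = counts.get(otu, 0) + 1
        table.append(counts)
    return table

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two {otu: count} dicts."""
    keys = set(u) | set(v)
    num = sum(abs(u.get(k, 0) - v.get(k, 0)) for k in keys)
    return num / (sum(u.values()) + sum(v.values()))

def pairwise(table):
    """Dissimilarities for all sample pairs, in a fixed order."""
    return [bray_curtis(table[i], table[j])
            for i, j in combinations(range(len(table)), 2)]

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical data: three unrelated "taxa", each with variants carrying
# 0, 2, 6 or 12 mismatches, sampled along a simple gradient of four samples.
BASES = ["A" * 100, "C" * 100, "G" * 100]
MISMATCHES = (0, 2, 6, 12)
GRADIENT = [(10, 3, 0), (7, 6, 2), (3, 7, 6), (0, 3, 10)]  # reads per taxon

def make_sample(composition):
    reads = []
    for base, n in zip(BASES, composition):
        for i in range(n):
            k = MISMATCHES[i % len(MISMATCHES)]
            reads.append("T" * k + base[k:])
    return reads

samples = [make_sample(c) for c in GRADIENT]
thresholds = range(87, 100, 2)  # assumed step: 87, 89, ..., 99
reference = pairwise(otu_table(samples, 97))
results = {t: pearson(pairwise(otu_table(samples, t)), reference)
           for t in thresholds}

for t, r in results.items():
    print(f"{t}% vs 97%: r = {r:.3f}")
```

In this toy data the sequence variants follow the abundance of their parent taxon, so the pairwise dissimilarities barely move as variants merge or split; real data sets need not behave this way, which is the question the study addresses.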