MARES, a replicable pipeline and curated reference database for marine eukaryote metabarcoding

The use of DNA metabarcoding to characterise the biodiversity of environmental and community samples has exploded in recent years. However, taxonomic inferences from these studies are contingent on the quality and completeness of the sequence reference database used to characterise sample species-co...

Full description

Saved in:
Bibliographic Details
Published inScientific data Vol. 7; no. 1; p. 209
Main Authors Arranz, Vanessa, Pearman, William S, Aguirre, J David, Liggins, Libby
Format Journal Article
LanguageEnglish
Published England Nature Publishing Group 03.07.2020
Nature Publishing Group UK
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:The use of DNA metabarcoding to characterise the biodiversity of environmental and community samples has exploded in recent years. However, taxonomic inferences from these studies are contingent on the quality and completeness of the sequence reference database used to characterise sample species-composition. In response, studies often develop custom reference databases to improve species assignment. The disadvantage of this approach is that it limits the potential for database re-use, and the transferability of inferences across studies. Here, we present the MARine Eukaryote Species (MARES) reference database for use in marine metabarcoding studies, created using a transparent and reproducible pipeline. MARES includes all COI sequences available in GenBank and BOLD for marine taxa, unified into a single taxonomy. Our pipeline facilitates the curation of sequences, synonymization of taxonomic identifiers used by different repositories, and formatting these data for use in taxonomic assignment tools. Overall, MARES provides a benchmark COI reference database for marine eukaryotes, and a standardised pipeline for (re)producing reference databases enabling integration and fair comparison of marine DNA metabarcoding results.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
ObjectType-Undefined-3
ISSN:2052-4463
2052-4463
DOI:10.1038/s41597-020-0549-9