Enhancing DNA barcode reference libraries by harvesting terrestrial arthropods at the Smithsonian's National Museum of Natural History

Bibliographic Details
Published in Biodiversity data journal Vol. 11; p. e100904
Main Authors Santos, Bernardo F, Miller, Meredith E, Miklasevskaja, Margarita, McKeown, Jaclyn T A, Redmond, Niamh E, Coddington, Jonathan A, Bird, Jessica, Miller, Scott E, Smith, Ashton, Brady, Seán G, Buffington, Matthew L, Chamorro, M Lourdes, Dikow, Torsten, Gates, Michael W, Goldstein, Paul, Konstantinov, Alexander, Kula, Robert, Silverson, Nicholas D, Solis, M Alma, deWaard, Stephanie L, Naik, Suresh, Nikolova, Nadya, Pentinsaari, Mikko, Prosser, Sean W J, Sones, Jayme E, Zakharov, Evgeny V, deWaard, Jeremy R
Format Journal Article
Language English
Published Bulgaria Pensoft Publishers 24.04.2023
Summary: The use of DNA barcoding has revolutionised biodiversity science, but its application depends on the existence of comprehensive and reliable reference libraries. For many poorly known taxa, such reference sequences are missing even at higher-level taxonomic scales. We harvested the collections of the Smithsonian's National Museum of Natural History (USNM) to generate DNA barcoding sequences for genera of terrestrial arthropods previously not recorded in one or more major public sequence databases. Our workflow used a mix of Sanger and Next-Generation Sequencing (NGS) approaches to maximise sequence recovery while ensuring affordable cost. In total, COI sequences were obtained for 5,686 specimens belonging to 3,737 determined species in 3,886 genera and 205 families distributed in 137 countries. Success rates varied widely according to collection data and focal taxon. NGS helped recover sequences of specimens that failed a previous run of Sanger sequencing. Success rates and the optimal balance between Sanger and NGS are the most important drivers to maximise output and minimise cost in future projects. The corresponding sequence and taxonomic data can be accessed through the Barcode of Life Data System, GenBank, the Global Biodiversity Information Facility, the Global Genome Biodiversity Network Data Portal and the NMNH data portal.
Academic editor: Rodolphe Rougerie
ISSN: 1314-2828
1314-2836
DOI: 10.3897/BDJ.11.E100904