CSMD: a computational subtraction-based microbiome discovery pipeline for species-level characterization of clinical metagenomic samples
Abstract Motivation Microbiome analyses of clinical samples with low microbial biomass are challenging because of the very small quantities of microbial DNA relative to the human host, ubiquitous contaminating DNA in sequencing experiments and the large and rapidly growing microbial reference databa...
Saved in:
Published in | Bioinformatics Vol. 36; no. 5; pp. 1577 - 1583 |
---|---|
Main Authors | , , , , , , , , , , , , , , , , , , |
Format | Journal Article |
Language | English |
Published |
England
Oxford University Press
01.03.2020
|
Online Access | Get full text |
Cover
Loading…
Summary: | Abstract
Motivation
Microbiome analyses of clinical samples with low microbial biomass are challenging because of the very small quantities of microbial DNA relative to the human host, ubiquitous contaminating DNA in sequencing experiments and the large and rapidly growing microbial reference databases.
Results
We present computational subtraction-based microbiome discovery (CSMD), a bioinformatics pipeline specifically developed to generate accurate species-level microbiome profiles for clinical samples with low microbial loads. CSMD applies strategies for the maximal elimination of host sequences with minimal loss of microbial signal and effectively detects microorganisms present in the sample with minimal false positives using a stepwise convergent solution. CSMD was benchmarked in a comparative evaluation with other classic tools on previously published well-characterized datasets. It showed higher sensitivity and specificity in host sequence removal and higher specificity in microbial identification, which led to more accurate abundance estimation. All these features are integrated into a free and easy-to-use tool. Additionally, CSMD applied to cell-free plasma DNA showed that microbial diversity within these samples is substantially broader than previously believed.
Availability and implementation
CSMD is freely available at https://github.com/liuyu8721/csmd.
Supplementary information
Supplementary data are available at Bioinformatics online. |
---|---|
Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 23 |
ISSN: | 1367-4803 1367-4811 1460-2059 1367-4811 |
DOI: | 10.1093/bioinformatics/btz790 |