CSMD: a computational subtraction-based microbiome discovery pipeline for species-level characterization of clinical metagenomic samples

Bibliographic Details
Published in Bioinformatics Vol. 36; no. 5; pp. 1577-1583
Main Authors Liu, Yu; Bible, Paul W; Zou, Bin; Liang, Qiaoxing; Dong, Cong; Wen, Xiaofeng; Li, Yan; Ge, Xiaofei; Li, Xifang; Deng, Xiuli; Ma, Rong; Guo, Shixin; Liang, Juanran; Chen, Tingting; Pan, Wenliang; Liu, Lixin; Chen, Wei; Wang, Xueqin; Wei, Lai
Format Journal Article
Language English
Published England Oxford University Press 01.03.2020

Summary:
Motivation: Microbiome analyses of clinical samples with low microbial biomass are challenging because of the very small quantities of microbial DNA relative to the human host, ubiquitous contaminating DNA in sequencing experiments and the large and rapidly growing microbial reference databases.
Results: We present computational subtraction-based microbiome discovery (CSMD), a bioinformatics pipeline specifically developed to generate accurate species-level microbiome profiles for clinical samples with low microbial loads. CSMD applies strategies for the maximal elimination of host sequences with minimal loss of microbial signal and effectively detects microorganisms present in the sample with minimal false positives using a stepwise convergent solution. CSMD was benchmarked in a comparative evaluation with other classic tools on previously published well-characterized datasets. It showed higher sensitivity and specificity in host sequence removal and higher specificity in microbial identification, which led to more accurate abundance estimation. All these features are integrated into a free and easy-to-use tool. Additionally, CSMD applied to cell-free plasma DNA showed that microbial diversity within these samples is substantially broader than previously believed.
Availability and implementation: CSMD is freely available at https://github.com/liuyu8721/csmd.
Supplementary information: Supplementary data are available at Bioinformatics online.
ISSN: 1367-4803, 1367-4811, 1460-2059
DOI: 10.1093/bioinformatics/btz790