How tool combinations in different pipeline versions affect the outcome in RNA-seq analysis

Data analysis tools are continuously changed and improved over time. In order to test how these changes influence the comparability between analyses, the output of different workflow options of the nf-core/rnaseq pipeline were compared. Five different pipeline settings (STAR+Salmon, STAR+RSEM, STAR+...

Full description

Saved in:
Bibliographic Details
Published inNAR genomics and bioinformatics Vol. 6; no. 1; p. lqae020
Main Authors Perelo, Louisa Wessels, Gabernet, Gisela, Straub, Daniel, Nahnsen, Sven
Format Journal Article
LanguageEnglish
Published England Oxford University Press 01.03.2024
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Data analysis tools are continuously changed and improved over time. In order to test how these changes influence the comparability between analyses, the output of different workflow options of the nf-core/rnaseq pipeline were compared. Five different pipeline settings (STAR+Salmon, STAR+RSEM, STAR+featureCounts, HISAT2+featureCounts, pseudoaligner Salmon) were run on three datasets (human, Arabidopsis, zebrafish) containing spike-ins of the External RNA Control Consortium (ERCC). Fold change ratios and differential expression of genes and spike-ins were used for comparative analyses of the different tools and versions settings of the pipeline. An overlap of 85% for differential gene classification between pipelines could be shown. Genes interpreted with a bias were mostly those present at lower concentration. Also, the number of isoforms and exons per gene were determinants. Previous pipeline versions using featureCounts showed a higher sensitivity to detect one-isoform genes like ERCC. To ensure data comparability in long-term analysis series it would be recommendable to either stay with the pipeline version the series was initialized with or to run both versions during a transition time in order to ensure that the target genes are addressed the same way.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ISSN:2631-9268
2631-9268
DOI:10.1093/nargab/lqae020