SPEAQeasy: a scalable pipeline for expression analysis and quantification for R/bioconductor-powered RNA-seq analyses

Bibliographic Details
Published in	BMC bioinformatics Vol. 22; no. 1; p. 224
Main Authors	Eagles, Nicholas J; Burke, Emily E; Leonard, Jacob; Barry, Brianna K; Stolz, Joshua M; Huuki, Louise; Phan, BaDoi N; Serrato, Violeta Larios; Gutiérrez-Millán, Everardo; Aguilar-Ordoñez, Israel; Jaffe, Andrew E; Collado-Torres, Leonardo
Format	Journal Article
Language	English
Published	England: BioMed Central Ltd, 01.05.2021
Subjects	Annotations; Bioconductor; Bioinformatics; Biological research; Biology, Experimental; Computer applications; Computer programs; Data processing; Gene expression; Gene sequencing; Genomes; High-Throughput Nucleotide Sequencing; Methods; Pipeline; Pipelining (computers); Quality control; R (Programming language); Ribonucleic acid; RNA; RNA processing; RNA sequencing; RNA-Seq; Science; Sequence Analysis, RNA; Software; Software development tools; Usability; Workflow; United States

Summary:	RNA sequencing (RNA-seq) is a widely used biological assay, and the volume of data generated with it continues to grow. In practice, a researcher must perform a large number of individual steps before raw RNA-seq reads yield directly valuable information, such as differential gene expression data. Existing software tools are typically specialized, performing only one step of a larger workflow, such as alignment of reads to a reference genome. The demand for a more comprehensive and reproducible workflow has led to the production of a number of publicly available RNA-seq pipelines. However, we have found that most require computational expertise to set up or share among several users, are not actively maintained, or lack features we have found to be important in our own analyses. In response to these concerns, we have developed a Scalable Pipeline for Expression Analysis and Quantification (SPEAQeasy), which is easy to install and share, and provides a bridge towards R/Bioconductor downstream analysis solutions. SPEAQeasy is portable across computational frameworks (SGE, SLURM, local, Docker integration), and different configuration files are provided (http://research.libd.org/SPEAQeasy/). SPEAQeasy is user-friendly and lowers the computational-domain entry barrier to RNA-seq data processing for biologists and clinicians, as the main input file is a table with sample names and their corresponding FASTQ files. The goal is to provide a flexible pipeline that is immediately usable by researchers, regardless of their technical background or computing environment.
ISSN:	1471-2105
DOI:	10.1186/s12859-021-04142-3