Computational and analytical challenges in single-cell transcriptomics

Bibliographic Details
Published in Nature reviews. Genetics Vol. 16; no. 3; pp. 133-145
Main Authors Stegle, Oliver; Teichmann, Sarah A.; Marioni, John C.
Format Journal Article
Language English
Published London Nature Publishing Group UK 01.03.2015
Nature Publishing Group
Summary:

Key Points: Until recently, RNA profiling was limited to ensemble-based approaches, which average over bulk populations of cells. Technological advances in single-cell RNA sequencing (scRNA-seq) now enable the transcriptomes of large numbers of individual cells to be assayed in an unbiased manner. To ensure that scRNA-seq data are fully exploited and interpreted correctly, it is important to apply appropriate computational and statistical approaches. Methods and principles previously developed for bulk RNA sequencing can be reused for this purpose; however, scRNA-seq data analysis poses several unique challenges that require new analytical strategies. At the experimental design stage, unique molecular identifiers and quantitative standards such as spike-ins need to be considered to allow accurate normalization and quality control of the raw data. Prior to using scRNA-seq data for biological discovery, it is important to consider both technical variability and confounding factors such as batch effects, the cell cycle or apoptosis. Computational methods that account for technical variation and remove confounding factors are beginning to emerge. The processed and normalized scRNA-seq data provide unique analysis opportunities that allow novel biological discoveries to be made. These include identification and characterization of cell types and the study of their organization in space and/or time; inference of gene regulatory networks and their robustness across individual cells; and characterization of the stochastic component of transcription.

High-throughput RNA sequencing (RNA-seq) is a powerful method for transcriptome-wide analysis that has recently been applied to single cells. This Review discusses the analytical and computational challenges of processing and analysing single-cell RNA-seq data, paying special consideration to differences relative to the analysis of RNA-seq data generated from bulk cell populations and discussing how single-cell-specific biological insights can be obtained.

The development of high-throughput RNA sequencing (RNA-seq) at the single-cell level has already led to profound new discoveries in biology, ranging from the identification of novel cell types to the study of global patterns of stochastic gene expression. Alongside the technological breakthroughs that have facilitated the large-scale generation of single-cell transcriptomic data, it is important to consider the specific computational and analytical challenges that still have to be overcome. Although some tools for analysing RNA-seq data from bulk cell populations can be readily applied to single-cell RNA-seq data, many new computational strategies are required to fully exploit this data type and to enable a comprehensive yet detailed study of gene expression at the single-cell level.
ISSN: 1471-0056, 1471-0064
DOI: 10.1038/nrg3833