Dropout-based feature selection for scRNASeq

Bibliographic Details
Published in	bioRxiv
Main Authors	Andrews, Tallulah S; Hemberg, Martin
Format	Paper
Language	English
Published	Cold Spring Harbor: Cold Spring Harbor Laboratory Press, 17.05.2018
Edition	1.4
Subjects	Bioinformatics; Product acceptance; Production planning; dropouts; single-cell RNASeq; feature selection
ISSN	2692-8205
DOI	10.1101/065094

More Information
Summary:	Feature selection is a key step in many single-cell RNASeq (scRNASeq) analyses. It is intended to preserve biologically relevant information while removing genes subject only to technical noise. Because it is frequently performed prior to dimensionality reduction, clustering and pseudotime analyses, feature selection can have a major impact on the results. Several approaches have been proposed for unsupervised feature selection from unprocessed single-cell expression matrices, most of them based on identifying highly variable genes in the dataset. We present two methods that take advantage of the prevalence of zeros (dropouts) in scRNASeq data to identify features. We show that dropout-based feature selection outperforms variance-based feature selection for multiple applications of single-cell RNASeq. Footnote: Updated to include a more thorough comparison of feature selection methods.
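
The general idea described in the summary, ranking genes by how many zeros they show rather than by their variance, can be illustrated with a small sketch. This is not the authors' method, which the record does not describe in detail. It assumes a simple Poisson null, under which a gene with mean count m is expected to be zero in a fraction exp(-m) of cells, and it flags genes whose observed dropout rate exceeds that expectation. The function names `dropout_excess` and `select_features` and the toy matrix are hypothetical.

```python
import numpy as np


def dropout_excess(counts):
    """Observed minus expected dropout rate per gene.

    counts: genes x cells array of non-negative counts.
    Assumption (illustrative only): a Poisson null, so the expected
    fraction of zeros for a gene with mean m is exp(-m).
    """
    counts = np.asarray(counts, dtype=float)
    mean_expr = counts.mean(axis=1)          # mean count per gene
    observed = (counts == 0).mean(axis=1)    # fraction of cells with zero
    expected = np.exp(-mean_expr)            # Poisson P(X = 0)
    return observed - expected


def select_features(counts, n_top):
    """Indices of the n_top genes with the largest dropout excess."""
    excess = dropout_excess(counts)
    order = np.argsort(-excess, kind="stable")  # descending by excess
    return order[:n_top]


# Toy matrix: 3 genes x 6 cells.
toy = np.array([
    [1, 1, 1, 1, 1, 1],  # mean 1, never zero: fewer dropouts than expected
    [0, 0, 0, 6, 0, 0],  # mean 1, mostly zero: more dropouts than expected
    [0, 0, 0, 0, 0, 0],  # never detected: observed and expected both 1
])
selected = select_features(toy, n_top=1)
```

In the toy matrix the first two genes have the same mean, so a mean-based filter cannot separate them, while the dropout excess singles out the gene expressed in only one cell. A real analysis would need a null model fitted to the data and a significance test instead of a fixed top-n cut.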