Dropout-based feature selection for scRNASeq

Bibliographic Details
Published in: bioRxiv
Main Authors: Andrews, Tallulah S; Hemberg, Martin
Format: Paper
Language: English
Published: Cold Spring Harbor: Cold Spring Harbor Laboratory Press, 17.05.2018
Cold Spring Harbor Laboratory
Edition: 1.4
ISSN: 2692-8205
DOI: 10.1101/065094

Summary: Feature selection is a key step in many single-cell RNASeq (scRNASeq) analyses. It is intended to preserve biologically relevant information while removing genes that are subject only to technical noise. Because it is frequently performed prior to dimensionality reduction, clustering and pseudotime analyses, feature selection can have a major impact on the results. Several approaches have been proposed for unsupervised feature selection from unprocessed single-cell expression matrices, most of them based on identifying highly variable genes in the dataset. We present two methods that take advantage of the prevalence of zeros (dropouts) in scRNASeq data to identify features. We show that dropout-based feature selection outperforms variance-based feature selection for multiple applications of single-cell RNASeq.
Footnotes: Updated to include a more thorough comparison of feature selection methods.