Dropout-based feature selection for scRNASeq
Published in | bioRxiv |
---|---|
Main Authors | , |
Format | Paper |
Language | English |
Published | Cold Spring Harbor; Cold Spring Harbor Laboratory Press; 17.05.2018; Cold Spring Harbor Laboratory |
Edition | 1.4 |
Subjects | |
ISSN | 2692-8205 |
DOI | 10.1101/065094 |
Summary: | Feature selection is a key step in many single-cell RNASeq (scRNASeq) analyses. It is intended to preserve biologically relevant information while removing genes subject only to technical noise. Because it is frequently performed prior to dimensionality reduction, clustering and pseudotime analyses, feature selection can have a major impact on the results. Several approaches have been proposed for unsupervised feature selection from unprocessed single-cell expression matrices, most based on identifying highly variable genes in the dataset. We present two methods that take advantage of the prevalence of zeros (dropouts) in scRNASeq data to identify features. We show that dropout-based feature selection outperforms variance-based feature selection for multiple applications of single-cell RNASeq. Footnote: Updated to include a more thorough comparison of feature selection methods. |
---|---|
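
The summary says that the prevalence of zeros (dropouts) can be used to identify features, but this record does not describe the two methods themselves. The following is only a minimal sketch of the general idea, not the authors' method. It assumes, purely for illustration, that a gene is interesting when it has more zeros than a Poisson model would predict for its mean count. The function names `dropout_excess` and `select_features` and the toy data are hypothetical.

```python
import math


def dropout_excess(counts_by_gene):
    """Score each gene by observed minus Poisson-expected zero fraction.

    counts_by_gene maps a gene name to a non-empty list of per-cell counts.
    The Poisson null is an illustrative assumption, not the paper's model.
    """
    scores = {}
    for gene, counts in counts_by_gene.items():
        n_cells = len(counts)
        mean = sum(counts) / n_cells
        observed = sum(1 for c in counts if c == 0) / n_cells
        expected = math.exp(-mean)  # P(X = 0) for a Poisson with this mean
        scores[gene] = observed - expected
    return scores


def select_features(counts_by_gene, n_top):
    """Return the n_top genes with the largest excess of zeros."""
    scores = dropout_excess(counts_by_gene)
    return sorted(scores, key=scores.get, reverse=True)[:n_top]


# Hypothetical toy data: three genes measured in eight cells.
toy = {
    "gene_a": [0, 0, 0, 0, 8, 8, 8, 8],  # expressed in only half the cells
    "gene_b": [4, 4, 4, 4, 4, 4, 4, 4],  # same mean, no zeros
    "gene_c": [0, 0, 0, 0, 0, 0, 0, 0],  # never detected
}

top_genes = select_features(toy, 1)
```

In this toy input, `gene_a` and `gene_b` have the same mean count, but only `gene_a` has many zeros, so it receives the highest score. A gene that is never detected scores zero, because a mean of zero already predicts all zeros.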