NEBULA is a fast negative binomial mixed model for differential or co-expression analysis of large-scale multi-subject single-cell data
Published in | Communications Biology Vol. 4; no. 1; pp. 629 - 17 |
---|---|
Main Authors | Liang He et al. |
Format | Journal Article |
Language | English |
Published | London: Springer Science and Business Media LLC, 26.05.2021; Nature Publishing Group UK; Nature Publishing Group; Nature Portfolio |
Summary: | The increasing availability of single-cell data revolutionizes the understanding of biological mechanisms at cellular resolution. For differential expression analysis in multi-subject single-cell data, negative binomial mixed models account for both subject-level and cell-level overdispersions, but are computationally demanding. Here, we propose an efficient NEgative Binomial mixed model Using a Large-sample Approximation (NEBULA). The speed gain is achieved by analytically solving high-dimensional integrals instead of using the Laplace approximation. We demonstrate that NEBULA is orders of magnitude faster than existing tools and controls false-positive errors in marker gene identification and co-expression analysis. Using NEBULA in Alzheimer’s disease cohort data sets, we found that the cell-level expression of APOE correlated with that of other genetic risk factors (including CLU, CST3, TREM2, C1q, and ITM2B) in a cell-type-specific pattern and an isoform-dependent manner in microglia. NEBULA opens up a new avenue for the broad application of mixed models to large-scale multi-subject single-cell data.
The application of negative binomial mixed models (NBMMs) to single-cell data is computationally demanding. To address this issue, Liang He et al. developed NEBULA, an efficient algorithm for analyzing differential gene expression or co-expression networks in multi-subject single-cell data sets, and validated it on snRNA-seq and scRNA-seq data sets comprising ~200k cells from cohorts of Alzheimer’s disease and multiple sclerosis patients. |
---|---|
ISSN: | 2399-3642 |
DOI: | 10.1038/s42003-021-02146-6 |