NEBULA is a fast negative binomial mixed model for differential or co-expression analysis of large-scale multi-subject single-cell data

Bibliographic Details
Published in Communications Biology Vol. 4; no. 1; pp. 629 - 17
Main Authors He, Liang; Davila-Velderrain, Jose; Sumida, Tomokazu S.; Hafler, David A.; Kellis, Manolis; Kulminski, Alexander M.
Format Journal Article
Language English
Published London Springer Science and Business Media LLC 26.05.2021
Nature Publishing Group UK
Nature Publishing Group
Nature Portfolio
Summary: The increasing availability of single-cell data revolutionizes the understanding of biological mechanisms at cellular resolution. For differential expression analysis in multi-subject single-cell data, negative binomial mixed models account for both subject-level and cell-level overdispersions, but are computationally demanding. Here, we propose an efficient NEgative Binomial mixed model Using a Large-sample Approximation (NEBULA). The speed gain is achieved by analytically solving high-dimensional integrals instead of using the Laplace approximation. We demonstrate that NEBULA is orders of magnitude faster than existing tools and controls false-positive errors in marker gene identification and co-expression analysis. Using NEBULA in Alzheimer’s disease cohort data sets, we found that the cell-level expression of APOE correlated with that of other genetic risk factors (including CLU, CST3, TREM2, C1q, and ITM2B) in a cell-type-specific pattern and an isoform-dependent manner in microglia. NEBULA opens up a new avenue for the broad application of mixed models to large-scale multi-subject single-cell data.

The application of negative binomial mixed models (NBMMs) to single-cell data is computationally demanding. To address this issue, Liang He et al. have developed NEBULA, an efficient algorithm that can analyze differential gene expression or co-expression networks in multi-subject single-cell data sets, and validate it on snRNA-seq and scRNA-seq data sets comprising ~200k cells from cohorts of Alzheimer’s disease and multiple sclerosis patients.
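To make the two sources of overdispersion mentioned in the summary concrete, the following is a minimal simulation sketch. It is not NEBULA's algorithm and not the authors' code. It assumes a generic setup in which each subject has a gamma-distributed multiplicative random effect (subject-level overdispersion) and each cell's count is drawn from a gamma-Poisson mixture, which is one standard way to generate negative binomial counts (cell-level overdispersion). The function name `simulate_counts` and all parameter names and default values are hypothetical.

```python
import numpy as np


def simulate_counts(n_subjects=20, cells_per_subject=200, mean=5.0,
                    subject_var=0.5, cell_dispersion=2.0, seed=0):
    """Simulate counts for one gene across cells nested within subjects.

    Illustrative assumptions (not taken from the paper):
    - subject effects are gamma distributed with mean 1 and variance subject_var
    - given its subject, a cell's count is negative binomial with mean mu and
      variance mu + mu**2 / cell_dispersion
    """
    rng = np.random.default_rng(seed)

    # Subject-level random effect: Gamma(shape=1/v, scale=v) has mean 1, variance v.
    subject_effect = rng.gamma(shape=1.0 / subject_var, scale=subject_var,
                               size=n_subjects)

    # Every cell inherits the mean of the subject it belongs to.
    mu = np.repeat(mean * subject_effect, cells_per_subject)

    # Cell-level overdispersion: a Poisson rate that is itself gamma distributed
    # yields a negative binomial count.
    rate = rng.gamma(shape=cell_dispersion, scale=mu / cell_dispersion)
    counts = rng.poisson(rate)

    subject = np.repeat(np.arange(n_subjects), cells_per_subject)
    return subject, counts


subject, counts = simulate_counts()

# Under a plain Poisson model the variance would equal the mean; here it is larger.
print("mean:", counts.mean(), "variance:", counts.var())

# The per-subject means differ from one another, which is the subject-level
# variation a mixed model captures with a random effect.
per_subject_mean = np.array([counts[subject == s].mean()
                             for s in np.unique(subject)])
print("variance of per-subject means:", per_subject_mean.var())
```

Fitting a mixed model to data of this shape requires integrating over the subject effects for every gene, which is the computational cost the summary refers to.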
ISSN: 2399-3642
DOI: 10.1038/s42003-021-02146-6