NEBULA is a fast negative binomial mixed model for differential or co-expression analysis of large-scale multi-subject single-cell data


Bibliographic Details
Published in	Communications Biology Vol. 4; no. 1; article 629 (17 pages)
Main Authors	He, Liang; Davila-Velderrain, Jose; Sumida, Tomokazu S.; Hafler, David A.; Kellis, Manolis; Kulminski, Alexander M.
Format	Journal Article
Language	English
Published	London: Springer Science and Business Media LLC; Nature Publishing Group UK; Nature Publishing Group; Nature Portfolio. 26.05.2021
Subjects	38/39; 38/91; 631/114/2397; 631/337/2019; 692/617/375/365/1283; Alzheimer Disease; Alzheimer Disease - genetics; Alzheimer's disease; Apolipoprotein E; Apolipoproteins E; Apolipoproteins E - genetics; Approximation; Binomial Distribution; Biology; Biology (General); Biomedical and Life Sciences; Cell culture; Computational Biology; Computational Biology - methods; Datasets; Gene Expression; Gene Expression - genetics; Gene Expression Profiling; Gene Expression Profiling - methods; Humans; Life Sciences; Microglia; Microglia - metabolism; Models, Statistical; Multiple sclerosis; Neurodegenerative diseases; QH301-705.5; Risk factors; Single-Cell Analysis; Single-Cell Analysis - methods; snRNA
Summary:	The increasing availability of single-cell data revolutionizes the understanding of biological mechanisms at cellular resolution. For differential expression analysis in multi-subject single-cell data, negative binomial mixed models account for both subject-level and cell-level overdispersions, but are computationally demanding. Here, we propose an efficient NEgative Binomial mixed model Using a Large-sample Approximation (NEBULA). The speed gain is achieved by analytically solving high-dimensional integrals instead of using the Laplace approximation. We demonstrate that NEBULA is orders of magnitude faster than existing tools and controls false-positive errors in marker gene identification and co-expression analysis. Using NEBULA in Alzheimer’s disease cohort data sets, we found that the cell-level expression of APOE correlated with that of other genetic risk factors (including CLU, CST3, TREM2, C1q, and ITM2B) in a cell-type-specific pattern and an isoform-dependent manner in microglia. NEBULA opens up a new avenue for the broad application of mixed models to large-scale multi-subject single-cell data. The application of negative binomial mixed models (NBMMs) to single-cell data is computationally demanding. To address this issue, Liang He et al. have developed NEBULA, an efficient algorithm that can analyze differential gene expression or co-expression networks in multi-subject single-cell data sets, and validate it on snRNA-seq and scRNA-seq data sets comprising ~200k cells from cohorts of Alzheimer’s disease and multiple sclerosis patients.
ISSN:	2399-3642
DOI:	10.1038/s42003-021-02146-6
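
Illustrative note: the summary states that negative binomial mixed models account for both subject-level and cell-level overdispersion. The sketch below simulates counts with those two layers so the idea can be seen in numbers. It is not the NEBULA algorithm and does not reproduce the authors' estimation method. The function name `simulate_nbmm_counts`, all parameter names and values, and the choice of a gamma-distributed subject effect are assumptions made for illustration only.

```python
import numpy as np


def simulate_nbmm_counts(n_subjects=20, cells_per_subject=200,
                         mean_log_expr=0.5, subject_var=0.5,
                         cell_dispersion=0.3, seed=0):
    """Simulate counts for one gene with two layers of overdispersion.

    Hypothetical helper, not part of any published package.
    """
    rng = np.random.default_rng(seed)

    # Subject-level layer: one multiplicative effect per subject,
    # gamma-distributed with mean 1 and variance `subject_var` (assumed form).
    subject_effect = rng.gamma(shape=1.0 / subject_var,
                               scale=subject_var, size=n_subjects)

    # Every cell inherits the effect of the subject it came from.
    subject_ids = np.repeat(np.arange(n_subjects), cells_per_subject)
    mu = np.exp(mean_log_expr) * subject_effect[subject_ids]

    # Cell-level layer: a negative binomial count written as a gamma-Poisson
    # mixture, so that Var(count | mu) = mu + cell_dispersion * mu**2.
    cell_effect = rng.gamma(shape=1.0 / cell_dispersion,
                            scale=cell_dispersion, size=mu.size)
    counts = rng.poisson(mu * cell_effect)
    return subject_ids, counts


if __name__ == "__main__":
    ids, counts = simulate_nbmm_counts()
    # Per-subject mean counts differ because of the subject-level layer.
    subject_means = np.array([counts[ids == s].mean() for s in np.unique(ids)])
    print("overall mean:", counts.mean())
    print("overall variance:", counts.var())
    print("variance of per-subject means:", subject_means.var())
```

In this toy setup the overall variance exceeds the overall mean, which a plain Poisson model could not capture, and the per-subject means vary across subjects, which a cell-level negative binomial model alone would ignore.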