Statistical Inference of Cell-Type Proportions Estimated from Bulk Expression Data

Bibliographic Details
Published in Journal of the American Statistical Association Vol. 119; no. 548; pp. 2521-2532
Main Authors Cai, Biao; Zhang, Jingfei; Li, Hongyu; Su, Chang; Zhao, Hongyu
Format Journal Article
Language English
Published United States, Taylor & Francis Ltd, 2024
Summary: There is a growing interest in cell-type-specific analysis from bulk samples with a mixture of different cell types. A critical first step in such analyses is the accurate estimation of cell-type proportions in a bulk sample. Although many methods have been proposed recently, quantifying the uncertainties associated with the estimated cell-type proportions has not been well studied. Lack of consideration of these uncertainties can lead to missed or false findings in downstream analyses. In this article, we introduce a flexible statistical deconvolution framework that allows a general and subject-specific covariance of bulk gene expressions. Under this framework, we propose a decorrelated constrained least squares method called DECALS that estimates cell-type proportions as well as the sampling distribution of the estimates. Simulation studies demonstrate that DECALS can accurately quantify the uncertainties in the estimated proportions whereas other methods fail. Applying DECALS to analyze bulk gene expression data of post mortem brain samples from the ROSMAP and GTEx projects, we show that taking into account the uncertainties in the estimated cell-type proportions can lead to more accurate identifications of cell-type-specific differentially expressed genes and transcripts between different subject groups, such as between Alzheimer's disease patients and controls and between males and females.
ISSN: 0162-1459
1537-274X
DOI: 10.1080/01621459.2024.2382435