Fitting Penalized Estimator for Sparse Covariance Matrix with Left-Censored Data by the EM Algorithm

Bibliographic Details
Published in	Mathematics (Basel) Vol. 13; no. 3; p. 423
Main Authors	Lin, Shanyi; Zheng, Qian-Zhen; Shang, Laixu; Xu, Ping-Feng; Tang, Man-Lai
Format	Journal Article
Language	English
Published	Basel: MDPI AG, 01.02.2025
Subjects	Algorithms; Approximation; Censorship; Covariance matrix; Estimation; Estimators; Expectation Maximization algorithm; Fines & penalties; Hypothesis testing; left-censored data; Measurement techniques; Methods; Multivariate analysis; Normal distribution; penalized estimator; sparse covariance matrix; Sparsity; Variables

Summary:	Estimating a sparse covariance matrix can help identify important features and patterns, but traditional estimation methods require complete data vectors for all subjects. When data are left-censored due to detection limits, common strategies such as excluding censored individuals or replacing censored values with suitable constants may introduce large biases. In this paper, we propose two penalized log-likelihood estimators, one with the L1 penalty and one with the SCAD penalty, for estimating the sparse covariance matrix of a multivariate normal distribution in the presence of left-censored data. Fitting these estimators is challenging because the observed log-likelihood involves high-dimensional integration over the censored variables. To address this, we treat censored data as a special case of incomplete data and combine the Expectation Maximization (EM) algorithm with the coordinate descent algorithm to fit both estimators efficiently. Simulation studies show that both penalized estimators are more accurate than methods that replace censored values with constants, and that the SCAD-penalized estimator generally outperforms the L1-penalized estimator. We apply the method to proteomic datasets.
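The Summary's central idea, treating censored values as incomplete data and fitting by Expectation Maximization instead of substituting a constant, can be illustrated with a much simpler toy problem. The sketch below is not the authors' estimator: it assumes a single normal variable, no penalty, and no coordinate descent, and the function name `em_left_censored_normal`, the detection limit of -0.5, and the simulated sample are all hypothetical choices made for illustration. It only shows how the E-step replaces each censored value with truncated-normal moments rather than a fixed constant.

```python
import math
import random


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def em_left_censored_normal(observed, n_censored, limit, n_iter=200):
    """Toy EM for N(mu, sigma^2) when n_censored values are only known to lie below `limit`.

    Hypothetical univariate illustration; the paper's setting is multivariate and penalized.
    """
    n = len(observed) + n_censored
    # Initialize from the substitution estimate (censored values set to the limit).
    filled = observed + [limit] * n_censored
    mu = sum(filled) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in filled) / n)
    sum_obs = sum(observed)
    sum_obs_sq = sum(x * x for x in observed)
    for _ in range(n_iter):
        # E-step: first and second moments of X given X < limit (truncated normal).
        alpha = (limit - mu) / sigma
        ratio = norm_pdf(alpha) / norm_cdf(alpha)
        m1 = mu - sigma * ratio
        var_c = sigma ** 2 * (1.0 - alpha * ratio - ratio ** 2)
        m2 = var_c + m1 ** 2
        # M-step: update mean and standard deviation from the expected sufficient statistics.
        mu = (sum_obs + n_censored * m1) / n
        sigma = math.sqrt((sum_obs_sq + n_censored * m2) / n - mu ** 2)
    return mu, sigma


# Simulated data: true mean 0, true standard deviation 1, values below -0.5 are censored.
random.seed(0)
limit = -0.5
sample = [random.gauss(0.0, 1.0) for _ in range(5000)]
observed = [x for x in sample if x >= limit]
n_censored = len(sample) - len(observed)

mu_hat, sigma_hat = em_left_censored_normal(observed, n_censored, limit)

# Substitution baseline: replace every censored value with the detection limit.
filled = observed + [limit] * n_censored
mu_sub = sum(filled) / len(filled)
sigma_sub = math.sqrt(sum((x - mu_sub) ** 2 for x in filled) / len(filled))

print("EM estimate:          mu = %.3f, sigma = %.3f" % (mu_hat, sigma_hat))
print("Substitution estimate: mu = %.3f, sigma = %.3f" % (mu_sub, sigma_sub))
```

In this toy setting, substitution pulls the mean upward and shrinks the spread, because every censored value is moved up to the limit. The EM estimate corrects for the unobserved left tail.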
ISSN:	2227-7390
DOI:	10.3390/math13030423