Identification of survival relevant genes with measurement error in gene expression incorporated

Bibliographic Details
Published in: Communications in statistics. Theory and methods, Vol. 52, no. 15, pp. 5155-5172
Main Authors: Xiong, Juan; He, Wenqing
Format: Journal Article
Language: English
Published: Philadelphia: Taylor & Francis, 03.08.2023
Taylor & Francis Ltd
Summary: Modern gene expression technologies, such as microarrays and next-generation RNA sequencing, enable simultaneous measurement of the expression of a large number of genes, and therefore represent important tools in personalized medicine research for improving the accuracy of patient survival prediction. However, survival analysis with gene expression data can be challenging due to the high dimensionality. Proper identification of survival relevant genes is thus imperative for building suitable prediction models. Although gene expressions are typically subject to measurement error introduced by the complex experimental procedure, the issue of measurement error is often ignored in survival gene identification. In this article, the effect of measurement error on the identification of survival relevant genes is explored under the accelerated failure time model setting. Survival relevant genes are identified by regularizing the weighted least squares estimator with the adaptive LASSO penalty. The simulation-extrapolation method is incorporated to adjust for the impact of measurement error on gene identification. The performance of the proposed method is assessed by simulation studies, and its utility is illustrated with a real data set collected from the diffuse large-B-cell lymphoma study. The results show that the proposed method yields better prediction models than traditional methods that ignore measurement error in gene expressions.
ISSN: 0361-0926, 1532-415X
DOI: 10.1080/03610926.2021.2004424
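
Illustration: the summary mentions the simulation-extrapolation (SIMEX) method for adjusting for measurement error. The sketch below shows the general idea only, under strong simplifying assumptions that are not taken from the article: a single covariate, no censoring, ordinary least squares in place of the penalized weighted least squares estimator, a known measurement error standard deviation `sigma_u`, and a quadratic extrapolation function. All names (`ols_slope`, `simex_slope`, `sigma_u`) are hypothetical.

```python
import numpy as np


def ols_slope(w, y):
    """Least squares slope of y on a single covariate w (intercept included)."""
    w_c = w - w.mean()
    return float(w_c @ (y - y.mean()) / (w_c @ w_c))


def simex_slope(w, y, sigma_u, lambdas=(0.0, 0.5, 1.0, 1.5, 2.0),
                n_rep=50, seed=0):
    """Generic SIMEX correction for the slope (assumes sigma_u is known)."""
    rng = np.random.default_rng(seed)
    mean_slopes = []
    for lam in lambdas:
        # Simulation step: inflate the error variance to (1 + lam) * sigma_u**2
        # by adding extra noise, refit, and average over replicates.
        slopes = [
            ols_slope(w + np.sqrt(lam) * sigma_u * rng.standard_normal(w.size), y)
            for _ in range(n_rep)
        ]
        mean_slopes.append(np.mean(slopes))
    # Extrapolation step: fit a quadratic in lam and evaluate it at lam = -1,
    # which corresponds to zero measurement error.
    coefs = np.polyfit(lambdas, mean_slopes, deg=2)
    return float(np.polyval(coefs, -1.0))


# Synthetic data (assumed values, for illustration only).
rng = np.random.default_rng(1)
n, beta, sigma_u = 5000, 2.0, 0.5
x = rng.standard_normal(n)                   # true covariate, unobserved
y = 1.0 + beta * x + rng.standard_normal(n)  # response
w = x + sigma_u * rng.standard_normal(n)     # error-prone measurement of x

naive = ols_slope(w, y)                # attenuated toward zero
corrected = simex_slope(w, y, sigma_u)
print(f"true={beta:.2f} naive={naive:.3f} simex={corrected:.3f}")
```

In this toy setting the naive slope is biased toward zero, and the extrapolated estimate moves back toward the true value. A quadratic extrapolant typically removes most, but not all, of the bias.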