Improved nonparametric survival prediction using CoxPH, Random Survival Forest & DeepHit Neural Network

Bibliographic Details
Published in BMC Medical Informatics and Decision Making Vol. 24; no. 1; pp. 120 - 17
Main Authors Asghar, Naseem; Khalil, Umair; Ahmad, Basheer; Alshanbari, Huda M; Hamraz, Muhammad; Ahmad, Bakhtiyar; Khan, Dost Muhammad
Format Journal Article
Language English
Published England BioMed Central Ltd 07.05.2024
Summary: In recent times, time-to-event data, such as time to failure or death, are routinely collected alongside high-throughput covariates. These high-dimensional bioinformatics data often challenge classical survival models, which are either infeasible to fit or produce low prediction accuracy due to overfitting. To address this issue, the focus has shifted towards novel approaches for feature selection and survival prediction. In this article, we propose a new hybrid feature selection approach that handles high-dimensional bioinformatics datasets for improved survival prediction. This study explores the efficacy of four distinct variable selection techniques (LASSO, RSF-vs, SCAD, and CoxBoost) in the context of non-parametric biomedical survival prediction. We first perform variable selection with each of these methods. Survival analysis models, specifically CoxPH, RSF, and DeepHit NN, are then used to construct predictive models based on the selected variables. Furthermore, we introduce a novel approach wherein only variables consistently selected by a majority of the aforementioned feature selection techniques are considered. This strategy, referred to as the proposed method, aims to enhance the reliability and robustness of variable selection and thereby improve the predictive performance of the survival analysis models. To evaluate its effectiveness, we compare the proposed method with the existing LASSO, RSF-vs, SCAD, and CoxBoost techniques using several performance metrics, including the integrated Brier score (IBS), concordance index (C-Index), and integrated absolute error (IAE), on numerous high-dimensional survival datasets. The real data applications reveal that the proposed method outperforms the competing methods in terms of survival prediction accuracy.
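The majority-vote selection step described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `majority_vote_features`, the `min_votes` parameter, the strict-majority threshold (3 of 4 selectors), and the gene labels are all assumptions made for illustration, and the per-selector feature lists are invented placeholders rather than results from the article.

```python
from collections import Counter


def majority_vote_features(selections, min_votes=None):
    """Return features chosen by at least `min_votes` selectors.

    `selections` maps a selector name to the features it picked.
    By default a strict majority is required (assumption: more than
    half of the selectors, i.e. 3 of 4).
    """
    n_selectors = len(selections)
    if min_votes is None:
        min_votes = n_selectors // 2 + 1
    # Count each feature at most once per selector.
    votes = Counter(
        feature for chosen in selections.values() for feature in set(chosen)
    )
    return sorted(f for f, count in votes.items() if count >= min_votes)


# Hypothetical selector outputs (placeholder labels, not real results).
selections = {
    "LASSO": ["g1", "g2", "g3"],
    "RSF-vs": ["g2", "g3", "g4"],
    "SCAD": ["g1", "g3", "g5"],
    "CoxBoost": ["g2", "g3", "g6"],
}

consensus = majority_vote_features(selections)
print(consensus)  # ['g2', 'g3']
```

In this sketch the consensus set would then be passed to whichever survival model is being fitted (CoxPH, RSF, or DeepHit NN in the summary). Lowering `min_votes` to 2 would also admit features picked by exactly half of the selectors; which threshold the article uses is not stated in this record.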
ISSN: 1472-6947
DOI: 10.1186/s12911-024-02525-z