Comparison of Cox Model Methods in A Low-dimensional Setting with Few Events


Bibliographic Details
Published in	Genomics, proteomics & bioinformatics Vol. 14; no. 4; pp. 235 - 243
Main Authors	Ojeda, Francisco M., Müller, Christian, Börnigen, Daniela, Trégouët, David-Alexandre, Schillert, Arne, Heinig, Matthias, Zeller, Tanja, Schnabel, Renate B.
Format	Journal Article
Language	English
Published	China Elsevier Ltd 01.08.2016 Department of General and Interventional Cardiology, University Heart Center Hamburg-Eppendorf, 20246 Hamburg, Germany; Sorbonne Universités, Université Pierre et Marie Curie Paris 06, Institut National pour la Santé et la Recherche Médicale INSERM, Unité Mixte de Recherche en Santé UMR_S 1166, F-75013 Paris, France; Institut für Medizinische Biometrie und Statistik, Universität zu Lübeck, Universitätsklinikum Schleswig-Holstein, Campus Lübeck, 23562 Lübeck, Germany; Institute of Computational Biology, German Research Center for Environmental Health, Helmholtz Zentrum München, 85764 Neuherberg, Germany Elsevier Oxford University Press
Subjects	Bioinformatics; Biomarkers - blood; Biomarkers - metabolism; Computer Science; Coronary artery disease; Coronary Artery Disease - diagnosis; Coronary Artery Disease - genetics; Cox model; Estimation methods; Events; Events per variable; Genetic Variation; Humans; Low-dimensional; Original Research; Penalized regression; Prediction models; Prognosis; Proportional Hazards Models; Proportional hazards regression; Prospective Studies; Setting
Summary:	Prognostic models based on survival data frequently make use of the Cox proportional hazards model. Developing reliable Cox models with few events relative to the number of predictors can be challenging, even in low-dimensional datasets with a much larger number of observations than variables. In such a setting, we examined the performance of methods used to estimate a Cox model, including (i) the full model using all available predictors and estimated by standard techniques, (ii) backward elimination (BE), (iii) ridge regression, (iv) least absolute shrinkage and selection operator (lasso), and (v) elastic net. Based on a prospective cohort of patients with manifest coronary artery disease (CAD), we performed a simulation study to compare the predictive accuracy, calibration, and discrimination of these approaches. Candidate predictors for incident cardiovascular events included clinical variables, biomarkers, and a selection of genetic variants associated with CAD. The penalized methods, i.e., ridge, lasso, and elastic net, showed comparable performance in terms of predictive accuracy, calibration, and discrimination, and outperformed BE and the full model. Excessive shrinkage was observed in some cases for the penalized methods, mostly in the simulation scenarios with the lowest ratio of the number of events to the number of variables. We conclude that in similar settings, these three penalized methods can be used interchangeably. The full model and backward elimination are not recommended in rare event scenarios.
Bibliography:	11-4926/Q ORCID: 0000-0002-9449-6865. ORCID: 0000-0002-8170-6632. ORCID: 0000-0003-4037-144X. ORCID: 0000-0002-5612-1720. ORCID: 0000-0001-9084-7800. ORCID: 0000-0001-7170-9509. ORCID: 0000-0003-3379-2641. ORCID: 0000-0002-7370-2033.
ISSN:	1672-0229 2210-3244
DOI:	10.1016/j.gpb.2016.03.006