A support vector machine-based cure rate model for interval censored data

Bibliographic Details
Published in Statistical Methods in Medical Research, Vol. 32, no. 12, pp. 2405-2422
Main Authors Pal, Suvra, Peng, Yingwei, Aselisewine, Wisdom, Barui, Sandip
Format Journal Article
Language English
Published London, England SAGE Publications 01.12.2023
Summary: The mixture cure rate model is the most commonly used cure rate model in the literature. In the context of the mixture cure rate model, the standard approach to modeling the effect of covariates on the cured or uncured probability is to use a logistic function. This readily implies that the boundary classifying the cured and uncured subjects is linear. In this article, we propose a new mixture cure rate model for interval-censored data that uses the support vector machine to model the effect of covariates on the uncured or the cured probability (i.e. on the incidence part of the model). Our proposed model inherits the features of the support vector machine and provides the flexibility to capture classification boundaries that are nonlinear and more complex. The latency part is modeled by a proportional hazards structure with an unspecified baseline hazard function. We develop an estimation procedure based on the expectation maximization algorithm to estimate the cured/uncured probability and the latency model parameters. Our simulation study results show that the proposed model performs better in capturing complex classification boundaries when compared to both logistic regression-based and spline regression-based mixture cure rate models. We also show that our model's ability to capture complex classification boundaries improves the estimation results corresponding to the latency part of the model. For illustrative purposes, we present our analysis by applying the proposed methodology to NASA's Hypobaric Decompression Sickness Database.
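The structure described in the summary can be sketched in a few lines. The following is a minimal illustration of the generic mixture cure formulation, not the authors' implementation: the population survival function is a mixture of cured subjects (who never experience the event) and uncured subjects with a proportional hazards survival function, and the expectation step assigns each subject a posterior probability of being uncured. All names here (`example_cumhaz`, `pi_uncured`, `linpred`) are hypothetical. The linear baseline cumulative hazard is an assumption made only so the example runs, since the article leaves the baseline unspecified. The uncured probability is passed in as a plain number, whereas in the article it would come from the support vector machine.

```python
import math


def example_cumhaz(t):
    """Hypothetical baseline cumulative hazard H0(t) = 0.1 * t (assumption)."""
    return 0.1 * t


def uncured_survival(t, linpred, baseline_cumhaz):
    """Latency part: S_u(t | x) = exp(-H0(t) * exp(x'beta)) under proportional hazards."""
    return math.exp(-baseline_cumhaz(t) * math.exp(linpred))


def population_survival(t, pi_uncured, linpred, baseline_cumhaz):
    """Mixture cure model: S_pop(t) = (1 - pi) + pi * S_u(t)."""
    s_u = uncured_survival(t, linpred, baseline_cumhaz)
    return (1.0 - pi_uncured) + pi_uncured * s_u


def e_step_weight(left, right, pi_uncured, linpred, baseline_cumhaz):
    """Posterior probability of being uncured for a subject observed on (left, right].

    A finite right endpoint means the event occurred, so the subject is uncured.
    An infinite right endpoint means right censoring at `left`, and the cure
    status is unknown.
    """
    if math.isfinite(right):
        return 1.0
    s_u = uncured_survival(left, linpred, baseline_cumhaz)
    return pi_uncured * s_u / ((1.0 - pi_uncured) + pi_uncured * s_u)


if __name__ == "__main__":
    # Subject with uncured probability 0.6 and linear predictor 0.
    print(population_survival(5.0, 0.6, 0.0, example_cumhaz))
    # Same subject, right-censored at t = 5.
    print(e_step_weight(5.0, math.inf, 0.6, 0.0, example_cumhaz))
```

As time grows, `population_survival` levels off at the cured proportion `1 - pi_uncured` rather than dropping to zero, which is the defining feature of a cure rate model.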
ISSN: 0962-2802, 1477-0334
DOI: 10.1177/09622802231210917