Random Forests for time-fixed and time-dependent predictors: The DynForest R package

The R package DynForest implements random forests for predicting a continuous, a categorical or a (multiple causes) time-to-event outcome based on time-fixed and time-dependent predictors. The main originality of DynForest is that it handles time-dependent predictors that can be endogeneous (i.e., i...

Full description

Saved in:
Bibliographic Details
Published inarXiv.org
Main Authors Devaux, Anthony, Proust-Lima, Cécile, Genuer, Robin
Format Paper
LanguageEnglish
Published Ithaca Cornell University Library, arXiv.org 11.04.2024
Subjects
Online AccessGet full text
ISSN2331-8422

Cover

More Information
Summary:The R package DynForest implements random forests for predicting a continuous, a categorical or a (multiple causes) time-to-event outcome based on time-fixed and time-dependent predictors. The main originality of DynForest is that it handles time-dependent predictors that can be endogeneous (i.e., impacted by the outcome process), measured with error and measured at subject-specific times. At each recursive step of the tree building process, the time-dependent predictors are internally summarized into individual features on which the split can be done. This is achieved using flexible linear mixed models (thanks to the R package lcmm) which specification is pre-specified by the user. DynForest returns the mean for continuous outcome, the category with a majority vote for categorical outcome or the cumulative incidence function over time for survival outcome. DynForest also computes variable importance and minimal depth to inform on the most predictive variables or groups of variables. This paper aims to guide the user with step-by-step examples for fitting random forests using DynForest.
Bibliography:content type line 50
SourceType-Working Papers-1
ObjectType-Working Paper/Pre-Print-1
ISSN:2331-8422