Ensemble learning of inverse probability weights for marginal structural modeling in large observational datasets

Bibliographic Details
Published in	Statistics in Medicine, Vol. 34, no. 1, pp. 106-117
Main Authors	Gruber, Susan; Logan, Roger W.; Jarrín, Inmaculada; Monge, Susana; Hernán, Miguel A.
Format	Journal Article
Language	English
Published	England: Blackwell Publishing Ltd; Wiley Subscription Services, Inc; 15 January 2015
Subjects	Algorithms; Antiretroviral Therapy, Highly Active - statistics & numerical data; Artificial intelligence; Bias; Computer Simulation; Confidence Intervals; Confounding Factors (Epidemiology); Data Interpretation, Statistical; data-adaptive; ensemble learning; HIV; HIV Infections - drug therapy; HIV Infections - mortality; HIV Infections - prevention & control; Human immunodeficiency virus; Humans; inverse probability weighting; Logistic Models; longitudinal data; Machine Learning; marginal structural model; Medical statistics; Models, Statistical; Mortality; Mortality - trends; Probability; Regression analysis; Spain; super learning
Summary:	Inverse probability weights used to fit marginal structural models are typically estimated using logistic regression. However, a data-adaptive procedure may be able to better exploit information available in measured covariates. By combining predictions from multiple algorithms, ensemble learning offers an alternative to logistic regression modeling to further reduce bias in estimated marginal structural model parameters. We describe the application of two ensemble learning approaches to estimating stabilized weights: super learning (SL), an ensemble machine learning approach that relies on V-fold cross-validation, and an ensemble learner (EL) that creates a single partition of the data into training and validation sets. Longitudinal data from two multicenter cohort studies in Spain (CoRIS and CoRIS-MD) were analyzed to estimate the mortality hazard ratio for initiation versus no initiation of combined antiretroviral therapy among HIV-positive subjects. Both ensemble approaches produced hazard ratio estimates further away from the null, and with tighter confidence intervals, than logistic regression modeling. Computation time for EL was less than half that of SL. We conclude that ensemble learning using a library of diverse candidate algorithms offers an alternative to parametric modeling of inverse probability weights when fitting marginal structural models. With large datasets, EL provides a rich search over the solution space in less time than SL with comparable results. Copyright © 2014 John Wiley & Sons, Ltd.
Bibliography:	ArticleID:SIM6322; istex:7A0C99876A785994C06D85242671F1CF71FD942F; ark:/67375/WNG-3VWD77X0-3; Supporting Info Item
ISSN:	0277-6715; 1097-0258
DOI:	10.1002/sim.6322
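
Illustration: the summary above describes estimating stabilized weights with an ensemble learner that uses a single partition of the data into training and validation sets. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a simplified point-treatment setting with one binary covariate (the article analyzes longitudinal data), simulated data, and a hypothetical library of just two candidate predictors blended by a convex weight chosen on the validation set. All function and variable names are illustrative.

```python
import math
import random

random.seed(0)

# Simulated point-treatment data (hypothetical): binary covariate l, binary treatment a.
data = []
for _ in range(2000):
    l = 1 if random.random() < 0.5 else 0
    a = 1 if random.random() < (0.7 if l == 1 else 0.3) else 0
    data.append((l, a))

# Single partition into training and validation sets.
train, valid = data[:1000], data[1000:]


def smoothed_rate(treatments):
    """Treated fraction with add-one smoothing, so it stays strictly inside (0, 1)."""
    return (sum(treatments) + 1) / (len(treatments) + 2)


def fit_marginal(rows):
    """Candidate 1: ignore the covariate and predict the overall treated fraction."""
    p = smoothed_rate([a for _, a in rows])
    return lambda l: p


def fit_stratified(rows):
    """Candidate 2: predict the treated fraction within each covariate level."""
    rates = {level: smoothed_rate([a for l, a in rows if l == level]) for level in (0, 1)}
    return lambda l: rates[l]


def log_loss(predict, rows):
    """Average negative log-likelihood of the observed treatments."""
    total = 0.0
    for l, a in rows:
        p = predict(l)
        total -= math.log(p if a == 1 else 1.0 - p)
    return total / len(rows)


# Fit every candidate on the training set only.
candidates = [fit_marginal(train), fit_stratified(train)]


def blend(alpha):
    """Convex combination of the two candidate predictions of P(A = 1 | L)."""
    return lambda l: alpha * candidates[0](l) + (1.0 - alpha) * candidates[1](l)


# Choose the blending weight that minimizes log loss on the validation set.
alpha = min((k / 20 for k in range(21)), key=lambda w: log_loss(blend(w), valid))
ensemble = blend(alpha)

# Stabilized weight: P(A = a) / P(A = a | L), with the denominator from the ensemble.
numerator = fit_marginal(data)


def stabilized_weight(l, a):
    num = numerator(l) if a == 1 else 1.0 - numerator(l)
    den = ensemble(l) if a == 1 else 1.0 - ensemble(l)
    return num / den


weights = [stabilized_weight(l, a) for l, a in data]
print(f"alpha={alpha:.2f}, mean stabilized weight={sum(weights) / len(weights):.3f}")
```

In this sketch a super-learning variant would replace the single train/validation split with V-fold cross-validation when choosing the blending weight, at the cost of refitting each candidate V times. A mean stabilized weight near 1 is a common informal check that the treatment model is not badly misspecified.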