Formulating causal questions and principled statistical answers


Bibliographic Details
Published in: Statistics in Medicine, Vol. 39, no. 30, pp. 4922-4948
Main Authors: Goetghebeur, Els; le Cessie, Saskia; De Stavola, Bianca; Moodie, Erica EM; Waernbaum, Ingeborg
Format: Journal Article Publication
Language: English
Published: England, Wiley Subscription Services, Inc, 30.12.2020
John Wiley and Sons Inc
ISSN: 0277-6715, 1097-0258
DOI: 10.1002/sim.8741

Summary: Although review papers on causal inference methods are now available, there is a lack of introductory overviews on what they can render and on the guiding criteria for choosing one particular method. This tutorial gives an overview in situations where an exposure of interest is set at a chosen baseline (“point exposure”) and the target outcome arises at a later time point. We first phrase relevant causal questions and make a case for being specific about the possible exposure levels involved and the populations for which the question is relevant. Using the potential outcomes framework, we describe principled definitions of causal effects and of estimation approaches classified according to whether they invoke the no unmeasured confounding assumption (including outcome regression and propensity score‐based methods) or an instrumental variable with added assumptions. We mainly focus on continuous outcomes and causal average treatment effects. We discuss interpretation, challenges, and potential pitfalls and illustrate application using a “simulation learner” that mimics the effect of various breastfeeding interventions on a child's later development. This involves a typical simulation component with generated exposure, covariate, and outcome data inspired by a randomized intervention study. The simulation learner further generates various (linked) exposure types with a set of possible values per observation unit, from which observed as well as potential outcome data are generated. It thus provides true values of several causal effects. R code for data generation and analysis is available on www.ofcaus.org, where SAS and Stata code for analysis is also provided.
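The idea of generating both potential outcomes per unit, so that the true average treatment effect is known and estimators can be checked against it, can be illustrated with a minimal sketch. This is not the authors' simulation learner or their code from www.ofcaus.org: the data-generating model, the single binary confounder, all parameter values, and every function name below are assumptions chosen for illustration, and Python's standard library is used only for convenience. Under no unmeasured confounding, the sketch compares a naive difference in means with outcome regression (here a saturated model, that is, standardization over confounder strata) and inverse probability weighting by an estimated propensity score.

```python
import random
from statistics import mean


def simulate(n=20000, seed=1):
    """Generate toy rows (l, a, y, y0, y1); all parameter values are hypothetical."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        l = 1 if rng.random() < 0.5 else 0      # binary baseline confounder
        y0 = 1.0 + 2.0 * l + rng.gauss(0, 1)    # potential outcome without exposure
        y1 = y0 + 3.0 + 1.0 * l                 # potential outcome with exposure
        p = 0.8 if l == 1 else 0.2              # exposure probability depends on l
        a = 1 if rng.random() < p else 0        # observed point exposure
        y = y1 if a == 1 else y0                # consistency: observe one outcome
        rows.append((l, a, y, y0, y1))
    return rows


def true_ate(rows):
    """Average of y1 - y0, available only because both outcomes were simulated."""
    return mean(r[4] - r[3] for r in rows)


def naive_difference(rows):
    """Unadjusted difference in mean observed outcome between exposure groups."""
    exposed = mean(r[2] for r in rows if r[1] == 1)
    unexposed = mean(r[2] for r in rows if r[1] == 0)
    return exposed - unexposed


def standardized_ate(rows):
    """Outcome regression with a saturated model: average stratum-specific contrasts."""
    n = len(rows)
    estimate = 0.0
    for level in (0, 1):
        stratum = [r for r in rows if r[0] == level]
        m1 = mean(r[2] for r in stratum if r[1] == 1)
        m0 = mean(r[2] for r in stratum if r[1] == 0)
        estimate += len(stratum) / n * (m1 - m0)
    return estimate


def ipw_ate(rows):
    """Inverse probability weighting with the propensity score estimated per stratum."""
    n = len(rows)
    ps = {}
    for level in (0, 1):
        stratum = [r for r in rows if r[0] == level]
        ps[level] = sum(r[1] for r in stratum) / len(stratum)
    mean_y1 = sum(a * y / ps[l] for l, a, y, _, _ in rows) / n
    mean_y0 = sum((1 - a) * y / (1 - ps[l]) for l, a, y, _, _ in rows) / n
    return mean_y1 - mean_y0


if __name__ == "__main__":
    data = simulate()
    print("true ATE        :", round(true_ate(data), 3))
    print("naive difference:", round(naive_difference(data), 3))
    print("standardization :", round(standardized_ate(data), 3))
    print("IPW             :", round(ipw_ate(data), 3))
```

In this toy setup the naive contrast is biased because the confounder raises both the exposure probability and the outcome, while the two adjusted estimators should land close to the simulated truth. With a single binary confounder and fully saturated models the standardization and weighting estimates coincide numerically; with continuous covariates and parametric models they would generally differ.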
Bibliography: Funding information
Fonds de recherche du Québec, Santé, Chercheur‐boursier senior career award; Lorentz Center Leiden, Natural Sciences and Engineering Research Council (NSERC) of Canada, Discovery Grant #RGPIN‐2014‐05776; UK Medical Research Council Grant, MR/R025215/1; Vetenskapsrådet, 2016‐00703