Use of directed acyclic graphs (DAGs) to identify confounders in applied health research: review and recommendations

Bibliographic Details
Published in International journal of epidemiology Vol. 50; no. 2; pp. 620-632
Main Authors Tennant, Peter W G, Murray, Eleanor J, Arnold, Kellyn F, Berrie, Laurie, Fox, Matthew P, Gadd, Sarah C, Harrison, Wendy J, Keeble, Claire, Ranker, Lynsie R, Textor, Johannes, Tomova, Georgia D, Gilthorpe, Mark S, Ellison, George T H
Format Journal Article
Language English
Published England Oxford University Press 17.05.2021
Summary: Directed acyclic graphs (DAGs) are an increasingly popular approach for identifying confounding variables that require conditioning when estimating causal effects. This review examined the use of DAGs in applied health research to inform recommendations for improving their transparency and utility in future research. Original health research articles published during 1999-2017 mentioning 'directed acyclic graphs' (or similar) or citing DAGitty were identified from Scopus, Web of Science, Medline and Embase. Data were extracted on the reporting of estimands, DAGs and adjustment sets, alongside the characteristics of each article's largest DAG. A total of 234 articles were identified that reported using DAGs. A fifth (n = 48, 21%) reported their target estimand(s) and half (n = 115, 48%) reported the adjustment set(s) implied by their DAG(s). Two-thirds of the articles (n = 144, 62%) made at least one DAG available. DAGs varied in size but averaged 12 nodes [interquartile range (IQR): 9-16, range: 3-28] and 29 arcs (IQR: 19-42, range: 3-99). The median saturation (i.e. percentage of total possible arcs) was 46% (IQR: 31-67, range: 12-100). Of the DAGs, 37% (n = 53) included unobserved variables, 17% (n = 25) included 'super-nodes' (i.e. nodes containing more than one variable) and 34% (n = 49) were visually arranged so that the constituent arcs flowed in the same direction (e.g. top-to-bottom). There is substantial variation in the use and reporting of DAGs in applied health research. Although this partly reflects their flexibility, it also highlights some potential areas for improvement. This review therefore offers several recommendations to improve the reporting and use of DAGs in future research.
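The saturation measure mentioned in the summary (percentage of total possible arcs) can be illustrated with a minimal sketch. It assumes that "total possible arcs" means the maximum number of arcs in a DAG on n nodes, n(n - 1)/2; the summary does not state the formula, and the function names `max_arcs` and `saturation` are hypothetical. The inputs 12 and 29 are the reported average node and arc counts, so the result need not equal the reported median saturation, which is presumably taken over the saturations of the individual DAGs.

```python
def max_arcs(n_nodes: int) -> int:
    """Largest number of arcs a DAG on n_nodes nodes can contain.

    Assumption: every pair of nodes can be joined by at most one arc,
    giving n * (n - 1) / 2 possible arcs.
    """
    return n_nodes * (n_nodes - 1) // 2


def saturation(n_nodes: int, n_arcs: int) -> float:
    """Percentage of the possible arcs that are actually drawn."""
    possible = max_arcs(n_nodes)
    if possible == 0:
        raise ValueError("saturation is undefined for fewer than two nodes")
    if not 0 <= n_arcs <= possible:
        raise ValueError("arc count is outside the feasible range for a DAG")
    return 100.0 * n_arcs / possible


# Illustration only: a DAG with the reported average size (12 nodes, 29 arcs).
example = saturation(12, 29)
print(f"{max_arcs(12)} possible arcs, saturation = {example:.1f}%")
```

A fully saturated DAG (100%) has an arc between every pair of nodes, so it encodes no assumptions about absent causal effects.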
Joint senior authors.
ISSN: 0300-5771, 1464-3685
DOI: 10.1093/ije/dyaa213