A novel method for Causal Structure Discovery from EHR data, a demonstration on type-2 diabetes mellitus
Main Authors | , , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 10.11.2020 |
Subjects | |
Summary: | Introduction: The discovery of causal mechanisms underlying diseases enables
better diagnosis, prognosis and treatment selection. Clinical trials have been
the gold standard for determining causality, but they are resource intensive,
sometimes infeasible or unethical. Electronic Health Records (EHR) contain a
wealth of real-world data that holds promise for the discovery of disease
mechanisms, yet existing causal structure discovery (CSD) methods fall
short of leveraging them due to the special characteristics of EHR data. We
propose a new data transformation method and a novel CSD algorithm to overcome
the challenges posed by these characteristics. Materials and methods: We
demonstrated the proposed methods on an application to type-2 diabetes
mellitus. We used a large EHR data set from Mayo Clinic to internally evaluate
the proposed transformation and CSD methods and used another large data set
from an independent health system, Fairview Health Services, as external
validation. We compared the performance of our proposed method to Fast Greedy
Equivalence Search (FGES), a state-of-the-art CSD method, in terms of
correctness, stability and completeness. We tested the generalizability of the
proposed algorithm through external validation. Results and conclusions: The
proposed method improved over the existing methods by successfully
incorporating study design considerations, was robust in the face of unreliable EHR
timestamps and inferred causal effect directions more correctly and reliably.
The proposed data transformation successfully improved the clinical correctness
of the discovered graph and the consistency of edge orientation across
bootstrap samples. It resulted in superior accuracy, stability, and
completeness. |
---|---|
DOI: | 10.48550/arxiv.2011.05489 |
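
The summary reports stability as the consistency of edge orientation across bootstrap samples. One plausible way to quantify that idea is sketched below; it is an illustration only, not the authors' implementation. The function name `orientation_consistency`, the input format (one set of directed `(cause, effect)` edges per bootstrap run), and the toy variable names are all assumptions, and the example edges are made up rather than taken from the paper.

```python
from collections import Counter


def orientation_consistency(bootstrap_graphs):
    """Summarise how consistently each edge is oriented across bootstrap graphs.

    bootstrap_graphs: list of iterables of directed edges (cause, effect),
    one per bootstrap run (hypothetical input format).
    """
    forward = Counter()   # times a -> b was seen, where (a, b) is the sorted pair
    backward = Counter()  # times b -> a was seen for the same sorted pair
    present = Counter()   # number of graphs containing the pair in either direction

    for edges in bootstrap_graphs:
        seen = set()
        for cause, effect in set(edges):
            if cause == effect:
                continue  # ignore self-loops; a causal DAG has none
            pair = tuple(sorted((cause, effect)))
            if (cause, effect) == pair:
                forward[pair] += 1
            else:
                backward[pair] += 1
            seen.add(pair)
        present.update(seen)

    n_graphs = len(bootstrap_graphs)
    summary = {}
    for pair, count in present.items():
        f, b = forward[pair], backward[pair]
        summary[pair] = {
            # fraction of bootstrap graphs in which the two nodes are adjacent
            "presence": count / n_graphs,
            # share of the majority direction among all oriented occurrences
            "consistency": max(f, b) / (f + b),
            "majority": pair if f >= b else (pair[1], pair[0]),
        }
    return summary


# Toy input: four bootstrap graphs over made-up variables.
graphs = [
    {("obesity", "t2dm"), ("t2dm", "retinopathy")},
    {("obesity", "t2dm"), ("retinopathy", "t2dm")},
    {("t2dm", "obesity"), ("t2dm", "retinopathy")},
    {("obesity", "t2dm")},
]
summary = orientation_consistency(graphs)
for pair, stats in sorted(summary.items()):
    print(pair, stats)
```

A consistency of 1.0 means the edge pointed the same way in every bootstrap graph that contained it. Values near 0.5 indicate that the direction is essentially undetermined by the data.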