How and when informative visit processes can bias inference when using electronic health records data for clinical research

Bibliographic Details
Published in: Journal of the American Medical Informatics Association : JAMIA, Vol. 26, no. 12, pp. 1609-1617
Main Authors: Goldstein, Benjamin A; Phelan, Matthew; Pagidipati, Neha J; Peskoe, Sarah B
Format: Journal Article
Language: English
Published: England, Oxford University Press, 01.12.2019
Summary: Electronic health records (EHR) data have become a central data source for clinical research. One concern for using EHR data is that the process through which individuals engage with the health system, and find themselves within EHR data, can be informative. We have termed this process informed presence. In this study, we use simulation and real data to assess how informed presence can impact inference. We first simulated a visit process where a series of biomarkers were observed informatively and uninformatively over time. We further compared inference derived from a randomized controlled trial (ie, uninformative visits) and EHR data (ie, potentially informative visits). We find that bias arises only when there is a strong association both between the biomarker and the outcome and between the biomarker and the visit process. Moreover, once there are some uninformative visits, this bias is mitigated. In the data example, we find that when the "true" associations are null, there is no observed bias. These results suggest that an informative visit process can exaggerate an association but cannot induce one. Furthermore, careful study design can mitigate the potential bias when some noninformative visits are included. While there are legitimate concerns regarding biases that "messy" EHR data may induce, the conditions for such biases are extreme and can be accounted for.
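Illustration: the visit mechanism described in the summary can be sketched in a few lines of Python. This is a toy example and not the authors' simulation design; the function name, the distributions, and every parameter value (including the hypothetical gamma that links the biomarker to the chance of a visit) are assumptions made here for illustration. It shows only how an informative visit process shifts which biomarker values get recorded, not the bias results reported in the article.

```python
import math
import random
import statistics


def expit(z):
    """Logistic function mapping a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def simulate_observed_biomarker(n_patients, n_times, gamma, seed=0):
    """Return the biomarker values that end up recorded in a toy EHR.

    gamma controls how strongly the biomarker drives the chance of a visit:
    gamma = 0 gives an uninformative visit process, gamma > 0 an informative one.
    All distributions and constants are illustrative assumptions.
    """
    rng = random.Random(seed)
    observed = []
    for _ in range(n_patients):
        # Patient-level average biomarker level.
        patient_mean = rng.gauss(0.0, 1.0)
        for _ in range(n_times):
            # Biomarker value at this time point.
            value = patient_mean + rng.gauss(0.0, 1.0)
            # Probability that a visit (and hence a measurement) occurs.
            p_visit = expit(-1.0 + gamma * value)
            if rng.random() < p_visit:
                observed.append(value)
    return observed


uninformative = simulate_observed_biomarker(2000, 10, gamma=0.0)
informative = simulate_observed_biomarker(2000, 10, gamma=1.0)

print("mean recorded value, uninformative visits:",
      round(statistics.mean(uninformative), 2))
print("mean recorded value, informative visits:",
      round(statistics.mean(informative), 2))
```

Under these assumptions, the recorded values are centered near the population mean when gamma is 0 and are shifted upward when gamma is positive, because patients with higher biomarker values are more likely to be seen.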
ISSN: 1527-974X, 1067-5027
DOI: 10.1093/jamia/ocz148