Validation of an internationally derived patient severity phenotype to support COVID-19 analytics from electronic health record data

Bibliographic Details
Published in	Journal of the American Medical Informatics Association Vol. 28; no. 7; pp. 1411-1420
Main Authors	Klann, Jeffrey G, Estiri, Hossein, Weber, Griffin M, Moal, Bertrand, Avillach, Paul, Hong, Chuan, Tan, Amelia L M, Beaulieu-Jones, Brett K, Castro, Victor, Maulhardt, Thomas, Geva, Alon, Malovini, Alberto, South, Andrew M, Visweswaran, Shyam, Morris, Michele, Samayamuthu, Malarkodi J, Omenn, Gilbert S, Ngiam, Kee Yuan, Mandl, Kenneth D, Boeker, Martin, Olson, Karen L, Mowery, Danielle L, Follett, Robert W, Hanauer, David A, Bellazzi, Riccardo, Moore, Jason H, Loh, Ne-Hooi Will, Bell, Douglas S, Wagholikar, Kavishwar B, Chiovato, Luca, Tibollo, Valentina, Rieg, Siegbert, Li, Anthony L L J, Jouhet, Vianney, Schriver, Emily, Xia, Zongqi, Hutch, Meghan, Luo, Yuan, Kohane, Isaac S, Brat, Gabriel A, Murphy, Shawn N
Format	Journal Article Web Resource
Language	English
Published	England Oxford University Press 14.07.2021 BMJ Publishing Group
Subjects	COVID-19 - classification; Electronic Health Records; Hospitalization; Humans; Life Sciences; Machine Learning; Prognosis; Research and Applications; ROC Curve; Public health and epidemiology; Sensitivity and Specificity; Severity of Illness Index; Data networking; Novel coronavirus; Medical informatics; Data interoperability; Disease severity; Computable phenotype

Summary:	Abstract. Objective: The Consortium for Clinical Characterization of COVID-19 by EHR (4CE) is an international collaboration addressing coronavirus disease 2019 (COVID-19) with federated analyses of electronic health record (EHR) data. We sought to develop and validate a computable phenotype for COVID-19 severity. Materials and Methods: Twelve 4CE sites participated. First, we developed an EHR-based severity phenotype consisting of 6 code classes, and we validated it on patient hospitalization data from the 12 4CE clinical sites against the outcomes of intensive care unit (ICU) admission and/or death. We also piloted an alternative machine learning approach and compared selected predictors of severity with the 4CE phenotype at 1 site. Results: The full 4CE severity phenotype had a pooled sensitivity of 0.73 and a specificity of 0.83 for the combined outcome of ICU admission and/or death. The sensitivity of individual code categories for acuity had high variability, up to 0.65 across sites. At one pilot site, the expert-derived phenotype had a mean area under the curve of 0.903 (95% confidence interval, 0.886-0.921), compared with an area under the curve of 0.956 (95% confidence interval, 0.952-0.959) for the machine learning approach. Billing codes were poor proxies of ICU admission, with as low as 49% precision and recall compared with chart review. Discussion: We developed a severity phenotype using 6 code classes that proved resilient to coding variability across international institutions. In contrast, machine learning approaches may overfit hospital-specific orders. Manual chart review revealed discrepancies even in the gold-standard outcomes, possibly owing to heterogeneous pandemic conditions. Conclusions: We developed an EHR-based severity phenotype for COVID-19 in hospitalized patients and validated it at 12 international sites.
ISSN:	1067-5027 1527-974X
DOI:	10.1093/jamia/ocab018
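
The summary reports sensitivity and specificity of a severity phenotype against the outcome of ICU admission and/or death. As a minimal sketch of how such metrics can be computed from per-patient flags, the Python below may help; the function name, the variable names, and the toy data are hypothetical illustrations and are not taken from the article or from 4CE code.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    flagged: Sequence[bool], outcome: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of a binary flag against a binary outcome.

    flagged[i] -- hypothetical: True if the phenotype marks patient i as severe
    outcome[i] -- hypothetical: True if patient i had ICU admission and/or death
    """
    if len(flagged) != len(outcome):
        raise ValueError("flagged and outcome must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for f, o in zip(flagged, outcome) if f and o)
    fn = sum(1 for f, o in zip(flagged, outcome) if not f and o)
    tn = sum(1 for f, o in zip(flagged, outcome) if not f and not o)
    fp = sum(1 for f, o in zip(flagged, outcome) if f and not o)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative outcome")

    sensitivity = tp / (tp + fn)  # share of true severe cases that were flagged
    specificity = tn / (tn + fp)  # share of non-severe cases left unflagged
    return sensitivity, specificity


# Toy data for illustration only (not from the study).
flagged = [True, True, False, True, False, False, False, True]
outcome = [True, True, True, False, False, False, False, False]
sens, spec = sensitivity_specificity(flagged, outcome)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

The pooled figures in the summary combine results across sites; this sketch covers only the single-cohort calculation and makes no assumption about the pooling method.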