Temporal phenotyping of medically complex children via PARAFAC2 tensor factorization

Bibliographic Details
Published in Journal of Biomedical Informatics Vol. 93; p. 103125
Main Authors Perros, Ioakeim; Papalexakis, Evangelos E.; Vuduc, Richard; Searles, Elizabeth; Sun, Jimeng
Format Journal Article
Language English
Published United States Elsevier Inc 01.05.2019

Summary:
• Raw electronic health records are often too complex to provide an intuitive understanding of patient phenotypes and their evolution.
• To avoid time-consuming chart review, we propose an unsupervised computational framework that extracts phenotypes and their temporal trends without precise phenotype labels.
• We study a cohort of medically complex children and identify four phenotypes, which are validated by a clinical expert and by significant survival variations among the phenotypes.
Our aim is to extract clinically meaningful phenotypes from longitudinal electronic health records (EHRs) of medically complex children. These are fragile patients who consume a disproportionate amount of pediatric care resources but often end up with sub-optimal clinical outcomes. The rise in available EHRs provides a rich data source that can be used to disentangle their complex clinical conditions into concise, clinically meaningful groups of characteristics. We aim to identify those phenotypes and their temporal evolution in a scalable, computational manner that avoids time-consuming manual chart review.
We analyze longitudinal EHRs from Children's Healthcare of Atlanta covering 1045 medically complex patients with a total of 59,948 encounters over 2 years. We apply a tensor factorization method called PARAFAC2 to extract: (a) clinically meaningful groups of features, (b) concise patient representations indicating the presence of a phenotype for each patient, and (c) temporal signatures indicating the evolution of those phenotypes over time for each patient.
We identified four medically complex phenotypes, namely gastrointestinal disorders, oncological conditions, blood-related disorders, and neurological system disorders, which have distinct clinical characterizations among patients. We demonstrate the utility of the patient representations produced by PARAFAC2 for identifying groups of patients with significant survival variations. Finally, we showcase representative examples of the temporal phenotypic trends extracted for different patients.
Unsupervised temporal phenotyping is an important task because it minimizes the burden on clinical experts by limiting their involvement to validating the output phenotypes. PARAFAC2 has several compelling properties for temporal computational phenotyping: (a) it can handle high-dimensional data and variable numbers of encounters across patients, (b) it has an intuitive interpretation, and (c) it is free from ad-hoc parameter choices. Computational phenotypes, such as the ones computed by our approach, have multiple applications; we highlight three that are particularly useful for medically complex children: (1) integration into clinical decision support systems, (2) interpretable mortality prediction, and (3) clinical trial recruitment.
PARAFAC2 can be applied to unsupervised temporal phenotyping tasks where precise definitions of the phenotypes are absent and the lengths of patient records vary.
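The summary names three outputs of PARAFAC2 but does not spell out the model. As a minimal sketch, assuming the standard PARAFAC2 formulation (each patient's encounters-by-features matrix X_k is approximated by U_k S_k V^T, with U_k = Q_k H and Q_k having orthonormal columns), the NumPy snippet below builds synthetic data of that shape. All variable names, the feature count, and the encounter counts are hypothetical and are not taken from the article; only the number of phenotypes (four) follows the summary. Under this reading, the columns of V play the role of feature groups (a), the diagonal of S_k is the patient representation (b), and the columns of U_k are the temporal signatures (c).

```python
import numpy as np

rng = np.random.default_rng(0)

R = 4                     # number of phenotypes (the summary reports four)
J = 30                    # number of clinical features (hypothetical)
encounters = [5, 12, 8]   # encounters per patient (hypothetical, variable length)

# V: feature-by-phenotype matrix shared by all patients (non-negative here).
V = rng.random((J, R))
# H: small R-by-R matrix shared by all patients.
H = rng.standard_normal((R, R))


def random_orthonormal(rows, cols, rng):
    """Return a rows-by-cols matrix with orthonormal columns (needs rows >= cols)."""
    q, _ = np.linalg.qr(rng.standard_normal((rows, cols)))
    return q


slices, U, S = [], [], []
for I_k in encounters:
    Q_k = random_orthonormal(I_k, R, rng)  # patient-specific, Q_k^T Q_k = I
    U_k = Q_k @ H                          # temporal signatures, I_k x R
    S_k = np.diag(rng.random(R))           # phenotype weights for this patient
    X_k = U_k @ S_k @ V.T                  # synthetic encounters-by-features record
    slices.append(X_k)
    U.append(U_k)
    S.append(S_k)

# Each patient keeps their own number of rows, yet U_k^T U_k is the same
# matrix (H^T H) for every patient; this is the PARAFAC2 constraint.
for I_k, X_k, U_k in zip(encounters, slices, U):
    print(I_k, X_k.shape, np.allclose(U_k.T @ U_k, H.T @ H))
```

The sketch only generates data that satisfies the model; fitting the factors to real records requires an iterative solver, which is outside the scope of this illustration.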
ISSN:1532-0464, 1532-0480
DOI:10.1016/j.jbi.2019.103125