Functional clustering methods for longitudinal data with application to electronic health records

Bibliographic Details
Published in Statistical methods in medical research Vol. 30; no. 3; p. 655
Main Authors Zeldow, Bret; Flory, James; Stephens-Shields, Alisa; Raebel, Marsha; Roy, Jason A
Format Journal Article
Language English
Published England 01.03.2021
Abstract We develop a method to estimate subject-level trajectory functions from longitudinal data. The approach can be used for patient phenotyping, feature extraction, or, as in our motivating example, outcome identification, which refers to the process of identifying disease status through patient laboratory tests rather than through diagnosis codes or prescription information. We model the joint distribution of a continuous longitudinal outcome and baseline covariates using an enriched Dirichlet process prior. This joint model decomposes into (local) semiparametric linear mixed models for the outcome given the covariates and simple (local) marginals for the covariates. The nonparametric enriched Dirichlet process prior is placed on the regression and spline coefficients, the error variance, and the parameters governing the predictor space. This leads to clustering of patients based on their outcomes and covariates. We predict the outcome at unobserved time points for subjects with data at other time points as well as for new subjects with only baseline covariates. We find improved prediction over mixed models with Dirichlet process priors when there are a large number of covariates. Our method is demonstrated with electronic health records consisting of initiators of second-generation antipsychotic medications, which are known to increase the risk of diabetes. We use our model to predict laboratory values indicative of diabetes for each individual and assess incidence of suspected diabetes from the predicted dataset.
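The joint model described in the abstract can be sketched in hierarchical form. The notation below is assumed for illustration and is not taken from the paper: y_i is the longitudinal outcome vector for subject i, x_i the baseline covariates, θ_i the outcome-model parameters (regression and spline coefficients, error variance), ψ_i the covariate-model parameters, and EDP an enriched Dirichlet process with concentration parameters α_θ and α_ψ and base measure P_0.

```latex
\begin{aligned}
y_i \mid x_i, \theta_i &\sim f_y(y_i \mid x_i, \theta_i)
  && \text{(local semiparametric mixed model)} \\
x_i \mid \psi_i &\sim f_x(x_i \mid \psi_i)
  && \text{(simple local marginal for covariates)} \\
(\theta_i, \psi_i) \mid P &\sim P \\
P &\sim \mathrm{EDP}(\alpha_\theta, \alpha_\psi, P_0)
\end{aligned}
```

Because draws from P are discrete, subjects can share values of θ_i and ψ_i, which is what induces the clustering on outcomes and covariates mentioned in the abstract.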
Author Stephens-Shields, Alisa
Raebel, Marsha
Zeldow, Bret
Flory, James
Roy, Jason A
Author_xml – sequence: 1
  givenname: Bret
  orcidid: 0000-0002-3651-7365
  surname: Zeldow
  fullname: Zeldow, Bret
  organization: Department of Mathematics and Statistics, Colby College, Waterville, ME, USA
– sequence: 2
  givenname: James
  surname: Flory
  fullname: Flory, James
  organization: Department of Medicine, Weill Cornell Medical College, New York, NY, USA
– sequence: 3
  givenname: Alisa
  surname: Stephens-Shields
  fullname: Stephens-Shields, Alisa
  organization: Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA
– sequence: 4
  givenname: Marsha
  surname: Raebel
  fullname: Raebel, Marsha
  organization: Institute for Health Research, Kaiser Permanente Colorado, Aurora, CO, USA
– sequence: 5
  givenname: Jason A
  surname: Roy
  fullname: Roy, Jason A
  organization: Department of Biostatistics and Epidemiology, Rutgers School of Public Health, New Brunswick, NJ, USA
CitedBy_id crossref_primary_10_1111_sjos_12765
crossref_primary_10_1016_j_ins_2022_05_112
crossref_primary_10_1098_rsta_2022_0145
crossref_primary_10_1002_asmb_2736
ContentType Journal Article
DOI 10.1177/0962280220965630
Discipline Medicine
Statistics
Mathematics
EISSN 1477-0334
ExternalDocumentID 33176615
Genre Journal Article
IsPeerReviewed true
IsScholarly true
Issue 3
Keywords functional clustering
prediction
Bayesian nonparametrics
Dirichlet process
Outcome identification
Language English
ORCID 0000-0002-3651-7365
PMID 33176615
ParticipantIDs pubmed_primary_33176615
PublicationCentury 2000
PublicationDate 2021-Mar
PublicationDateYYYYMMDD 2021-03-01
PublicationDate_xml – month: 03
  year: 2021
  text: 2021-Mar
PublicationDecade 2020
PublicationPlace England
PublicationPlace_xml – name: England
PublicationTitle Statistical methods in medical research
PublicationTitleAlternate Stat Methods Med Res
PublicationYear 2021
StartPage 655
Title Functional clustering methods for longitudinal data with application to electronic health records
URI https://www.ncbi.nlm.nih.gov/pubmed/33176615
Volume 30