Self-supervised Representation Learning on Electronic Health Records with Graph Kernel Infomax

Bibliographic Details
Published in: arXiv.org
Main Authors: Hao-Ren Yao, Nairen Cao, Katina Russell, Der-Chen Chang, Ophir Frieder, Jeremy Fineman
Format: Paper, Journal Article
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 20.02.2024
Abstract: Learning representations of Electronic Health Records (EHRs) is an important yet underexplored research topic. Such representations benefit various clinical decision support applications, e.g., medication outcome prediction and patient similarity search. Current approaches rely on task-specific label supervision over vectorized sequential EHR data, which does not carry over to large-scale unsupervised scenarios. Contrastive learning has recently shown great success on self-supervised representation learning problems; however, complex temporality often degrades its performance. We propose Graph Kernel Infomax, a self-supervised graph kernel learning approach on the graphical representation of EHR, to overcome these problems. Unlike state-of-the-art methods, we do not change the graph structure to construct augmented views. Instead, we use Kernel Subspace Augmentation to embed nodes into two geometrically different manifold views. The entire framework is trained by contrasting node and graph representations on those two manifold views through commonly used contrastive objectives. Empirically, on publicly available benchmark EHR datasets, our approach yields performance on clinical downstream tasks that exceeds the state of the art. Theoretically, the variation in distance metrics naturally creates different views as data augmentation without changing graph structures.
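The idea of building two views from different distance metrics, rather than from a perturbed graph, can be illustrated with a small sketch. This is not the authors' implementation: the record does not specify Kernel Subspace Augmentation, so the sketch assumes random Fourier features for an RBF kernel at two bandwidths as a stand-in, a single mean-aggregation step over an unchanged adjacency matrix, and a node-level InfoNCE loss as one example of a commonly used contrastive objective. The names `kernel_view`, `propagate`, and `info_nce` are hypothetical.

```python
import numpy as np

def kernel_view(x, bandwidth, dim=64, seed=0):
    """Random Fourier features approximating an RBF kernel (assumed stand-in).

    Both views reuse the same random draws (same seed); only the
    bandwidth, i.e. the distance scaling, differs between them.
    """
    rng = np.random.default_rng(seed)
    w = rng.normal(size=(x.shape[1], dim)) / bandwidth
    b = rng.uniform(0.0, 2.0 * np.pi, size=dim)
    return np.sqrt(2.0 / dim) * np.cos(x @ w + b)

def propagate(adj, z):
    """One mean-aggregation step with self-loops; the graph itself is not altered."""
    a = adj + np.eye(adj.shape[0])
    a = a / a.sum(axis=1, keepdims=True)
    return a @ z

def info_nce(z1, z2, temperature=0.5):
    """Node-level InfoNCE: node i in view 1 is the positive for node i in view 2."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = (z1 @ z2.T) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

# Toy graph: 6 nodes on a chain, 8 random features per node (illustrative data only).
rng = np.random.default_rng(42)
x = rng.normal(size=(6, 8))
adj = np.zeros((6, 6))
for i in range(5):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

h1 = propagate(adj, kernel_view(x, bandwidth=1.0))  # view 1
h2 = propagate(adj, kernel_view(x, bandwidth=2.0))  # view 2, same graph
loss = info_nce(h1, h2)
print(f"node-level contrastive loss: {loss:.4f}")
```

In a trainable version, the fixed feature maps would be replaced by learned encoders and the loss minimized by gradient descent; the sketch only shows that the two views differ while the adjacency matrix stays the same.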
arXiv version: https://doi.org/10.48550/arXiv.2209.00655
Published version (access to full text may be restricted): https://doi.org/10.1145/3648695
Copyright 2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI 10.48550/arxiv.2209.00655
Discipline: Physics
EISSN: 2331-8422
Genre: Working Paper/Pre-Print
Subject Terms: Computer Science - Computers and Society; Computer Science - Learning; Data augmentation; Electronic health records; Graph representations; Graphical representations; Kernels; Learning; Manifolds; Nodes; Performance degradation