Self-supervised Representation Learning on Electronic Health Records with Graph Kernel Infomax
Published in | arXiv.org |
---|---|
Main Authors | Hao-Ren, Yao; Cao, Nairen; Russell, Katina; Der-Chen, Chang; Ophir Frieder; Fineman, Jeremy |
Format | Paper; Journal Article |
Language | English |
Published | Ithaca: Cornell University Library, arXiv.org, 20.02.2024 |
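Subjects | Computer Science - Computers and Society; Computer Science - Learning; Data augmentation; Electronic health records; Graph representations; Graphical representations; Kernels; Learning; Manifolds; Nodes; Performance degradation |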
Abstract | Learning representations of Electronic Health Records (EHRs) is an important yet under-explored research topic. It benefits various clinical decision support applications, e.g., medication outcome prediction or patient similarity search. Current approaches focus on task-specific label supervision on vectorized sequential EHR, which is not applicable to large-scale unsupervised scenarios. Recently, contrastive learning has shown great success on self-supervised representation learning problems. However, the complex temporality of EHR data often degrades its performance. We propose Graph Kernel Infomax, a self-supervised graph kernel learning approach on the graphical representation of EHR, to overcome these problems. Unlike state-of-the-art methods, we do not change the graph structure to construct augmented views. Instead, we use Kernel Subspace Augmentation to embed nodes into two geometrically different manifold views. The entire framework is trained by contrasting node and graph representations on those two manifold views through commonly used contrastive objectives. Empirically, on publicly available benchmark EHR datasets, our approach yields performance on clinical downstream tasks that exceeds the state of the art. Theoretically, the variation in distance metrics naturally creates different views as data augmentation without changing graph structures. |
---|---|
Author | Cao, Nairen; Russell, Katina; Hao-Ren, Yao; Ophir Frieder; Der-Chen, Chang; Fineman, Jeremy |
Author_xml | – sequence: 1 givenname: Yao surname: Hao-Ren fullname: Hao-Ren, Yao – sequence: 2 givenname: Nairen surname: Cao fullname: Cao, Nairen – sequence: 3 givenname: Katina surname: Russell fullname: Russell, Katina – sequence: 4 givenname: Chang surname: Der-Chen fullname: Der-Chen, Chang – sequence: 5 fullname: Ophir Frieder – sequence: 6 givenname: Jeremy surname: Fineman fullname: Fineman, Jeremy |
BackLink | View paper in arXiv: https://doi.org/10.48550/arXiv.2209.00655; View published paper (access to full text may be restricted): https://doi.org/10.1145/3648695 |
ContentType | Paper; Journal Article |
Copyright | 2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. http://creativecommons.org/licenses/by/4.0 |
Copyright_xml | – notice: 2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. – notice: http://creativecommons.org/licenses/by/4.0 |
DBID | 8FE 8FG ABJCF ABUWG AFKRA AZQEC BENPR BGLVJ CCPQU DWQXO HCIFZ L6V M7S PIMPY PQEST PQQKQ PQUKI PRINS PTHSS AKY GOX |
DOI | 10.48550/arxiv.2209.00655 |
DatabaseName | ProQuest SciTech Collection ProQuest Technology Collection Materials Science & Engineering Collection ProQuest Central (Alumni) ProQuest Central ProQuest Central Essentials ProQuest Central Technology Collection ProQuest One Community College ProQuest Central SciTech Premium Collection (Proquest) (PQ_SDU_P3) ProQuest Engineering Collection Engineering Database Publicly Available Content Database ProQuest One Academic Eastern Edition (DO NOT USE) ProQuest One Academic ProQuest One Academic UKI Edition ProQuest Central China Engineering Collection arXiv Computer Science arXiv.org |
DatabaseTitle | Publicly Available Content Database Engineering Database Technology Collection ProQuest Central Essentials ProQuest One Academic Eastern Edition ProQuest Central (Alumni Edition) SciTech Premium Collection ProQuest One Community College ProQuest Technology Collection ProQuest SciTech Collection ProQuest Central China ProQuest Central ProQuest Engineering Collection ProQuest One Academic UKI Edition ProQuest Central Korea Materials Science & Engineering Collection ProQuest One Academic Engineering Collection |
DatabaseTitleList | Publicly Available Content Database |
Database_xml | – sequence: 1 dbid: GOX name: arXiv.org url: http://arxiv.org/find sourceTypes: Open Access Repository – sequence: 2 dbid: 8FG name: ProQuest Technology Collection url: https://search.proquest.com/technologycollection1 sourceTypes: Aggregation Database |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Physics |
EISSN | 2331-8422 |
ExternalDocumentID | 2209_00655 |
Genre | Working Paper/Pre-Print |
GroupedDBID | 8FE 8FG ABJCF ABUWG AFKRA ALMA_UNASSIGNED_HOLDINGS AZQEC BENPR BGLVJ CCPQU DWQXO FRJ HCIFZ L6V M7S M~E PIMPY PQEST PQQKQ PQUKI PRINS PTHSS AKY GOX |
ID | FETCH-LOGICAL-a955-a820f6367797f5a8a86267a14aa9ac1f37fe9bca2d340a189cbe5c79ce2f37453 |
IEDL.DBID | 8FG |
IngestDate | Fri Feb 23 12:13:29 EST 2024; Thu Oct 10 16:02:06 EDT 2024 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-a955-a820f6367797f5a8a86267a14aa9ac1f37fe9bca2d340a189cbe5c79ce2f37453 |
OpenAccessLink | https://www.proquest.com/docview/2709805270?pq-origsite=%requestingapplication% |
PQID | 2709805270 |
PQPubID | 2050157 |
ParticipantIDs | arxiv_primary_2209_00655 proquest_journals_2709805270 |
PublicationCentury | 2000 |
PublicationDate | 20240220 |
PublicationDateYYYYMMDD | 2024-02-20 |
PublicationDate_xml | – month: 02 year: 2024 text: 20240220 day: 20 |
PublicationDecade | 2020 |
PublicationPlace | Ithaca |
PublicationPlace_xml | – name: Ithaca |
PublicationTitle | arXiv.org |
PublicationYear | 2024 |
Publisher | Cornell University Library, arXiv.org |
Publisher_xml | – name: Cornell University Library, arXiv.org |
SSID | ssj0002672553 |
Score | 1.9096736 |
SecondaryResourceType | preprint |
SourceID | arxiv proquest |
SourceType | Open Access Repository; Aggregation Database |
SubjectTerms | Computer Science - Computers and Society; Computer Science - Learning; Data augmentation; Electronic health records; Graph representations; Graphical representations; Kernels; Learning; Manifolds; Nodes; Performance degradation |
Title | Self-supervised Representation Learning on Electronic Health Records with Graph Kernel Infomax |
URI | https://www.proquest.com/docview/2709805270 https://arxiv.org/abs/2209.00655 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Self-supervised+Representation+Learning+on+Electronic+Health+Records+with+Graph+Kernel+Infomax&rft.jtitle=arXiv.org&rft.au=Hao-Ren%2C+Yao&rft.au=Cao%2C+Nairen&rft.au=Russell%2C+Katina&rft.au=Der-Chen%2C+Chang&rft.date=2024-02-20&rft.pub=Cornell+University+Library%2C+arXiv.org&rft.eissn=2331-8422&rft_id=info:doi/10.48550%2Farxiv.2209.00655 |
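
The abstract in this record says that augmented views come from varying the distance metric rather than from altering the graph. The following is a minimal illustrative sketch of that general idea, not the authors' Kernel Subspace Augmentation or their training procedure. Every name in it (`rbf_kernel`, `laplacian_kernel`, `embed`, `info_nce`), the choice of kernels, the kernel-PCA style embedding, and all parameter values are assumptions made for illustration. The same node features are embedded through two kernels built on different distances (squared Euclidean and L1), and an InfoNCE-style loss treats the same node in the two views as a positive pair.

```python
import numpy as np


def rbf_kernel(x, gamma):
    """Gaussian kernel built on squared Euclidean distance (assumed view A)."""
    sq = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq)


def laplacian_kernel(x, gamma):
    """Laplacian kernel built on L1 (Manhattan) distance (assumed view B)."""
    l1 = np.abs(x[:, None, :] - x[None, :, :]).sum(axis=-1)
    return np.exp(-gamma * l1)


def embed(kernel, dim):
    """Kernel-PCA style embedding from the top `dim` eigenpairs of the kernel."""
    vals, vecs = np.linalg.eigh(kernel)  # eigenvalues in ascending order
    top_vals = np.clip(vals[-dim:], 0.0, None)  # guard against tiny negatives
    return vecs[:, -dim:] * np.sqrt(top_vals)


def info_nce(za, zb, temperature=0.5):
    """InfoNCE loss: node i in view A should match node i in view B."""
    za = za / np.maximum(np.linalg.norm(za, axis=1, keepdims=True), 1e-12)
    zb = zb / np.maximum(np.linalg.norm(zb, axis=1, keepdims=True), 1e-12)
    logits = za @ zb.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


# Toy node features standing in for one patient graph (hypothetical data).
rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))

view_a = embed(rbf_kernel(x, gamma=0.5), dim=3)
view_b = embed(laplacian_kernel(x, gamma=0.5), dim=3)
loss = info_nce(view_a, view_b)
```

In this sketch the graph itself is never edited: only the distance metric behind the kernel changes between the two views. In a trainable setting, a learned encoder would presumably sit between the kernel views and the loss, and a graph-level summary (for example, a mean over node embeddings) could be contrasted as well. Both are omitted here.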