Clinical Information Extraction with Large Language Models: A Case Study on Organ Procurement
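Published in | AMIA ... Annual Symposium proceedings Vol. 2024; p. 115 |
---|---|
Main Authors | Adam, Hammaad; Lin, Junjing; Lin, Jianchang; Keenan, Hillary; Wilson, Ashia; Ghassemi, Marzyeh |
Format | Journal Article |
Language | English |
Published | United States, 2024 |
Subjects | Humans; Information Storage and Retrieval - methods; Large Language Models; Natural Language Processing; Tissue and Organ Procurement |
ISSN | 1942-597X, 1559-4076 |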
Abstract | Recent work has demonstrated that large language models (LLMs) are powerful tools for clinical information extraction from unstructured text. However, existing approaches have largely ignored the extraction of numeric information such as laboratory tests and vital signs. In this article, we present a case study on organ procurement that evaluates the ability of LLMs to extract numeric data from clinical text. We first describe our LLM-based approach, introducing a prompting strategy for numeric extraction and novel heuristics to combat hallucination. We validate our approach on a hand-annotated set of 298 notes, demonstrating that it has high accuracy, precision and recall. We then highlight the value of our approach for downstream data analysis using a corpus of 43,719 notes on 14,342 potential organ donors. This case study is a key component of an ongoing collaboration that aims to make data on organ procurement publicly available for informatics research. |
---|---|
Author | Adam, Hammaad; Ghassemi, Marzyeh; Lin, Junjing; Keenan, Hillary; Wilson, Ashia; Lin, Jianchang |
Author_xml | 1. Adam, Hammaad (Massachusetts Institute of Technology, Cambridge, MA, USA); 2. Lin, Junjing (Takeda Pharmaceuticals, Cambridge, MA, USA); 3. Lin, Jianchang (Takeda Pharmaceuticals, Cambridge, MA, USA); 4. Keenan, Hillary (Takeda Pharmaceuticals, Cambridge, MA, USA); 5. Wilson, Ashia (Massachusetts Institute of Technology, Cambridge, MA, USA); 6. Ghassemi, Marzyeh (Vector Institute, Toronto, Ontario, Canada) |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/40417525 (View this record in MEDLINE/PubMed) |
ContentType | Journal Article |
Copyright | 2024 AMIA - All rights reserved. |
Discipline | Medicine |
EISSN | 1559-4076 |
ISSN | 1942-597X |
IsPeerReviewed | true |
IsScholarly | true |
Language | English |
License | 2024 AMIA - All rights reserved. |
PMID | 40417525 |
PublicationDate | 2024-00-00 20240101 |
PublicationPlace | United States |
PublicationTitle | AMIA ... Annual Symposium proceedings |
PublicationTitleAlternate | AMIA Annu Symp Proc |
PublicationYear | 2024 |
StartPage | 115 |
SubjectTerms | Humans; Information Storage and Retrieval - methods; Large Language Models; Natural Language Processing; Tissue and Organ Procurement |
Title | Clinical Information Extraction with Large Language Models: A Case Study on Organ Procurement |
URI | https://www.ncbi.nlm.nih.gov/pubmed/40417525 https://www.proquest.com/docview/3207703666 |
Volume | 2024 |
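
The abstract mentions heuristics to combat hallucination and a validation reporting accuracy, precision and recall, but this record does not describe either in detail. The following is only a minimal sketch of what such steps could look like, not the authors' method. It assumes a simple grounding rule (an extracted number is kept only if the same number literally occurs in the note) and exact-match scoring of field-to-value pairs. The function names, the example note, and the field names are all hypothetical.

```python
import re


def numbers_in_text(note: str) -> set[float]:
    """Return every integer or decimal literal that appears in the note."""
    return {float(tok) for tok in re.findall(r"\d+(?:\.\d+)?", note)}


def is_grounded(value: float, note: str) -> bool:
    """Assumed heuristic (not from the paper): accept an extracted value
    only if the same number occurs somewhere in the source note."""
    return value in numbers_in_text(note)


def precision_recall(predicted: dict[str, float],
                     gold: dict[str, float]) -> tuple[float, float]:
    """Precision and recall over field -> value pairs.

    A prediction counts as correct only when the field is annotated in
    the gold set and the values match exactly.
    """
    correct = sum(1 for field, value in predicted.items()
                  if gold.get(field) == value)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical note and extractions, for illustration only.
note = "Creatinine 1.2 mg/dL, HR 88, BP 120/80."
predicted = {"creatinine": 1.2, "heart_rate": 90.0}
gold = {"creatinine": 1.2, "heart_rate": 88.0, "systolic_bp": 120.0}

# Drop values that do not appear in the note before scoring.
filtered = {f: v for f, v in predicted.items() if is_grounded(v, note)}
```

In this toy case the ungrounded heart-rate value is filtered out, which raises precision without changing recall. A literal-match rule like this would also reject values the note expresses in words or other units, so it is a deliberately conservative illustration.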