A Study of Biomedical Relation Extraction Using GPT Models
Published in | AMIA Summits on Translational Science proceedings, Vol. 2024, p. 391 |
Main Authors | Zhang, Jeffrey; Wibert, Maxwell; Zhou, Huixue; Peng, Xueqing; Chen, Qingyu; Keloth, Vipina K; Hu, Yan; Zhang, Rui; Xu, Hua; Raja, Kalpana |
Format | Journal Article |
Language | English |
Published | United States, 2024 |
ISSN | 2153-4063 |
Abstract | Relation Extraction (RE) is a natural language processing (NLP) task for extracting semantic relations between biomedical entities. Recent developments in pre-trained large language models (LLMs) have motivated NLP researchers to apply them to various NLP tasks. We investigated GPT-3.5-turbo and GPT-4 on extracting relations from three standard datasets: EU-ADR, the Gene Associations Database (GAD), and ChemProt. Unlike existing approaches, which use datasets with masked entities, we used three versions of each dataset in our experiments: one with masked entities, a second with the original (unmasked) entities, and a third with abbreviations replaced by their original terms. We developed prompts for the various versions and used the chat completion model from the GPT API. Our approach achieved F1-scores ranging from 0.498 to 0.809 with GPT-3.5-turbo, and a highest F1-score of 0.84 with GPT-4. For certain experiments, the performance of GPT, BioBERT, and PubMedBERT is almost the same. |
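The abstract describes prompting GPT chat-completion models to classify relations in three dataset variants (masked entities, unmasked entities, and abbreviations expanded). A minimal sketch of that setup is shown below; the prompt wording, label set, and helper names are illustrative assumptions, not the authors' exact prompts.

```python
# Illustrative sketch of binary biomedical relation extraction via a
# chat-completion API, in the style described by the abstract.
# Prompt text and variant hints are hypothetical, not from the paper.

def build_messages(sentence: str, variant: str = "masked") -> list:
    """Build chat messages asking whether the two entities in
    `sentence` are related (binary RE, as in GAD / EU-ADR)."""
    system = ("You are a biomedical relation-extraction assistant. "
              "Answer only 'yes' or 'no'.")
    hints = {
        "masked": "Entities are masked with placeholder tokens.",
        "unmasked": "Entities appear with their original names.",
        "expanded": "Abbreviations have been replaced with full terms.",
    }
    user = (f"{hints[variant]}\nSentence: {sentence}\n"
            "Is there a relation between the two entities?")
    return [{"role": "system", "content": system},
            {"role": "user", "content": user}]

def parse_label(completion_text: str) -> int:
    """Map the model's free-text answer to a binary label
    for computing precision, recall, and F1-score."""
    return 1 if completion_text.strip().lower().startswith("yes") else 0
```

With the `openai` Python client, the messages would be sent via `client.chat.completions.create(model=..., messages=build_messages(...))` and the reply text passed to `parse_label` before scoring against the gold labels.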
Author | Wibert, Maxwell; Keloth, Vipina K; Zhou, Huixue; Xu, Hua; Peng, Xueqing; Hu, Yan; Raja, Kalpana; Chen, Qingyu; Zhang, Jeffrey; Zhang, Rui |
Author_xml |
– sequence: 1; fullname: Zhang, Jeffrey; organization: Section for Biomedical Informatics and Data Science, School of Medicine, Yale University, New Haven, USA
– sequence: 2; fullname: Wibert, Maxwell; organization: Section for Biomedical Informatics and Data Science, School of Medicine, Yale University, New Haven, USA
– sequence: 3; fullname: Zhou, Huixue; organization: Institute for Health Informatics, University of Minnesota, Twin Cities, USA
– sequence: 4; fullname: Peng, Xueqing; organization: Section for Biomedical Informatics and Data Science, School of Medicine, Yale University, New Haven, USA
– sequence: 5; fullname: Chen, Qingyu; organization: Section for Biomedical Informatics and Data Science, School of Medicine, Yale University, New Haven, USA
– sequence: 6; fullname: Keloth, Vipina K; organization: Section for Biomedical Informatics and Data Science, School of Medicine, Yale University, New Haven, USA
– sequence: 7; fullname: Hu, Yan; organization: School of Biomedical Informatics, University of Texas Health Science at Houston, Houston, USA
– sequence: 8; fullname: Zhang, Rui; organization: Department of Surgery, School of Medicine, University of Minnesota, Minneapolis, USA
– sequence: 9; fullname: Xu, Hua; organization: Section for Biomedical Informatics and Data Science, School of Medicine, Yale University, New Haven, USA
– sequence: 10; fullname: Raja, Kalpana; organization: Section for Biomedical Informatics and Data Science, School of Medicine, Yale University, New Haven, USA
|
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/38827097 (View this record in MEDLINE/PubMed) |
ContentType | Journal Article |
Copyright | 2024 AMIA - All rights reserved. |
Copyright_xml | – notice: 2024 AMIA - All rights reserved. |
DBID | NPM 7X8 |
DatabaseName | PubMed MEDLINE - Academic |
DatabaseTitle | PubMed MEDLINE - Academic |
DatabaseTitleList | MEDLINE - Academic PubMed |
Database_xml | – sequence: 1 dbid: NPM name: PubMed url: https://proxy.k.utb.cz/login?url=http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed sourceTypes: Index Database |
DeliveryMethod | fulltext_linktorsrc |
EISSN | 2153-4063 |
ExternalDocumentID | 38827097 |
Genre | Journal Article |
GrantInformation_xml | – fundername: NIA NIH HHS grantid: R01 AG078154 – fundername: NLM NIH HHS grantid: T15 LM007056 |
ISSN | 2153-4063 |
IsPeerReviewed | false |
IsScholarly | true |
Keywords | GPT-4; prompt engineering; relation extraction; GPT-3.5-turbo; generative pre-trained transformer |
Language | English |
License | 2024 AMIA - All rights reserved. |
LinkModel | OpenURL |
Notes | ObjectType-Article-1; SourceType-Scholarly Journals-1; ObjectType-Feature-2 |
PMID | 38827097 |
PQID | 3064139704 |
PQPubID | 23479 |
ParticipantIDs | proquest_miscellaneous_3064139704 pubmed_primary_38827097 |
PublicationCentury | 2000 |
PublicationDate | 2024 |
PublicationDateYYYYMMDD | 2024-01-01 |
PublicationDate_xml | – year: 2024 text: 2024-00-00 |
PublicationDecade | 2020 |
PublicationPlace | United States |
PublicationPlace_xml | – name: United States |
PublicationTitle | AMIA Summits on Translational Science proceedings |
PublicationTitleAlternate | AMIA Jt Summits Transl Sci Proc |
PublicationYear | 2024 |
SSID | ssj0000446954 |
SourceID | proquest pubmed |
SourceType | Aggregation Database Index Database |
StartPage | 391 |
Title | A Study of Biomedical Relation Extraction Using GPT Models |
URI | https://www.ncbi.nlm.nih.gov/pubmed/38827097 https://www.proquest.com/docview/3064139704 |
Volume | 2024 |
linkProvider | Geneva Foundation for Medical Education and Research |