A Study of Biomedical Relation Extraction Using GPT Models

Relation Extraction (RE) is a natural language processing (NLP) task for extracting semantic relations between biomedical entities. Recent developments in pre-trained large language models (LLM) motivated NLP researchers to use them for various NLP tasks. We investigated GPT-3.5-turbo and GPT-4 on e...

Full description

Saved in:
Bibliographic Details
Published inAMIA Summits on Translational Science proceedings Vol. 2024; p. 391
Main Authors Zhang, Jeffrey, Wibert, Maxwell, Zhou, Huixue, Peng, Xueqing, Chen, Qingyu, Keloth, Vipina K, Hu, Yan, Zhang, Rui, Xu, Hua, Raja, Kalpana
Format Journal Article
LanguageEnglish
Published United States 2024
Subjects
Online AccessGet full text
ISSN2153-4063
2153-4063

Cover

More Information
Summary:Relation Extraction (RE) is a natural language processing (NLP) task for extracting semantic relations between biomedical entities. Recent developments in pre-trained large language models (LLM) motivated NLP researchers to use them for various NLP tasks. We investigated GPT-3.5-turbo and GPT-4 on extracting the relations from three standard datasets, EU-ADR, Gene Associations Database (GAD), and ChemProt. Unlike the existing approaches using datasets with masked entities, we used three versions for each dataset for our experiment: a version with masked entities, a second version with the original entities (unmasked), and a third version with abbreviations replaced with the original terms. We developed the prompts for various versions and used the chat completion model from GPT API. Our approach achieved a F1-score of 0.498 to 0.809 for GPT-3.5-turbo, and a highest F1-score of 0.84 for GPT-4. For certain experiments, the performance of GPT, BioBERT, and PubMedBERT are almost the same.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
ISSN:2153-4063
2153-4063