A Study of Biomedical Relation Extraction Using GPT Models

Relation Extraction (RE) is a natural language processing (NLP) task for extracting semantic relations between biomedical entities. Recent developments in pre-trained large language models (LLM) motivated NLP researchers to use them for various NLP tasks. We investigated GPT-3.5-turbo and GPT-4 on e...

Full description

Saved in:

Bibliographic Details
Published in	AMIA Summits on Translational Science proceedings Vol. 2024; p. 391
Main Authors	Zhang, Jeffrey, Wibert, Maxwell, Zhou, Huixue, Peng, Xueqing, Chen, Qingyu, Keloth, Vipina K, Hu, Yan, Zhang, Rui, Xu, Hua, Raja, Kalpana
Format	Journal Article
Language	English
Published	United States 2024
Subjects	GPT-4 Prompt engineering relation extraction GPT-3.5-turbo generative pre-trained transformer
Online Access	Get full text
ISSN	2153-4063 2153-4063

Cover

More Information
Summary:	Relation Extraction (RE) is a natural language processing (NLP) task for extracting semantic relations between biomedical entities. Recent developments in pre-trained large language models (LLM) motivated NLP researchers to use them for various NLP tasks. We investigated GPT-3.5-turbo and GPT-4 on extracting the relations from three standard datasets, EU-ADR, Gene Associations Database (GAD), and ChemProt. Unlike the existing approaches using datasets with masked entities, we used three versions for each dataset for our experiment: a version with masked entities, a second version with the original entities (unmasked), and a third version with abbreviations replaced with the original terms. We developed the prompts for various versions and used the chat completion model from GPT API. Our approach achieved a F1-score of 0.498 to 0.809 for GPT-3.5-turbo, and a highest F1-score of 0.84 for GPT-4. For certain experiments, the performance of GPT, BioBERT, and PubMedBERT are almost the same.
Bibliography:	ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 23
ISSN:	2153-4063 2153-4063