Modified R-BERT with global semantic information for relation classification task


Bibliographic Details
Published in: Computer Speech & Language, Vol. 89, p. 101686
Main Authors: Wang, Yuhua; Hu, Junying; Su, Yongli; Zhang, Bo; Sun, Kai; Zhang, Hai
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.01.2025
Summary: The objective of the relation classification task is to extract relations between entities. Recent studies have found that R-BERT (Wu and He, 2019), based on pre-trained BERT (Devlin et al., 2019), achieves very good results on the relation classification task. However, this method takes into account neither the semantic differences between different kinds of entities nor global semantic information. In this paper, we use two different fully connected layers to account for the semantic difference between subject and object entities. We also build a new module, named the Concat Module, to fully fuse the semantic information among the subject entity vector, the object entity vector, and the representation vector of the whole sample sentence. In addition, we apply average pooling to obtain a better representation of each entity, and add an activation operation with a new fully connected layer after the Concat Module. Modifying R-BERT in this way, we propose a new model named BERT with Global Semantic Information (GSR-BERT) for relation classification tasks. We evaluate our approach on two datasets: the SemEval-2010 Task 8 dataset and a Chinese character relationship classification dataset. Our approach achieves a significant improvement on both datasets, which indicates that it transfers well across different datasets. Furthermore, we show that the strategies used in our approach also apply to the named entity recognition task.
• We take into account the semantic difference between subject and object entities.
• We fully fuse the semantic information between entities and the whole sentence.
• The strategies we use apply broadly to different NLP tasks.
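The head architecture the summary describes can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the layer names (subj_fc, obj_fc, fuse_fc, cls_fc), the tanh activation, the hidden size 768, and the label count 19 (the usual SemEval-2010 Task 8 setting) are all assumptions for illustration; the encoder itself is replaced by random vectors standing in for BERT outputs.

```python
import numpy as np

rng = np.random.default_rng(0)
H = 768   # assumed BERT hidden size
C = 19    # assumed number of relation labels (SemEval-2010 Task 8)

def linear(x, w, b):
    """A fully connected layer: x @ w + b."""
    return x @ w + b

def gsr_bert_head(cls_vec, subj_tokens, obj_tokens, params):
    """Sketch of the modified R-BERT head described in the abstract:
    1. average-pool each entity's token vectors;
    2. pass subject and object through *separate* FC layers,
       reflecting their semantic difference;
    3. concatenate with the whole-sentence ([CLS]) vector
       (the 'Concat Module');
    4. apply an activation plus a further FC layer, then classify.
    """
    subj = subj_tokens.mean(axis=0)                      # average pooling
    obj = obj_tokens.mean(axis=0)
    subj = np.tanh(linear(subj, *params["subj_fc"]))     # subject-specific FC
    obj = np.tanh(linear(obj, *params["obj_fc"]))        # object-specific FC
    fused = np.concatenate([cls_vec, subj, obj])         # Concat Module
    fused = np.tanh(linear(fused, *params["fuse_fc"]))   # activation + new FC
    return linear(fused, *params["cls_fc"])              # logits over relations

# Randomly initialized weights, standing in for trained parameters.
params = {
    "subj_fc": (rng.standard_normal((H, H)) * 0.01, np.zeros(H)),
    "obj_fc":  (rng.standard_normal((H, H)) * 0.01, np.zeros(H)),
    "fuse_fc": (rng.standard_normal((3 * H, H)) * 0.01, np.zeros(H)),
    "cls_fc":  (rng.standard_normal((H, C)) * 0.01, np.zeros(C)),
}

logits = gsr_bert_head(
    cls_vec=rng.standard_normal(H),                # stand-in for [CLS] vector
    subj_tokens=rng.standard_normal((4, H)),       # 4 subject subword vectors
    obj_tokens=rng.standard_normal((2, H)),        # 2 object subword vectors
    params=params,
)
print(logits.shape)  # (19,)
```

The separate subj_fc and obj_fc weight matrices are the point of departure from the original R-BERT head, which applies one shared layer to both entity vectors; everything downstream of the concatenation follows the "activation plus new fully connected layer" step the summary mentions.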
ISSN: 0885-2308
DOI: 10.1016/j.csl.2024.101686