NEDORT: a novel and efficient approach to the data overlap problem in relational triples

Relation triple extraction is a combination of named entity recognition and relation prediction. Early works ignore the problem of data overlap when extracting triples, resulting in poor extraction performance. Subsequent works improve the capability of the model to extract overlapping triples throu...

Full description

Saved in:
Bibliographic Details
Published inComplex & intelligent systems Vol. 9; no. 5; pp. 5235 - 5250
Main Authors Zhang, Zhanjun, Hu, Xiaoru, Zhang, Haoyu, Liu, Jie
Format Journal Article
LanguageEnglish
Published Cham Springer International Publishing 01.10.2023
Springer Nature B.V
Springer
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Relation triple extraction is a combination of named entity recognition and relation prediction. Early works ignore the problem of data overlap when extracting triples, resulting in poor extraction performance. Subsequent works improve the capability of the model to extract overlapping triples through generative and extractive methods. These works achieve considerable performance but still suffer from some defects, such as poor extraction capability for individual triplets and inappropriate spatial distribution of the data. To solve the above problems, we perform sequence-to-matrix transformation and propose the NEDORT model. NEDORT predicts all subjects in the sentence and then completes the extraction of relation–object pairs. There are overlapping parts between relation–object pairs, so we conduct the conversion of sequence to matrix. We design the Differential Amplified Multi-head Attention method to extract subjects. This method highlights the locations of entities and captures sequence features from multiple dimensions. When performing the extraction of relation–object pairs, we fuse subject and sequence information through the Biaffine method and generate relation–sequence matrices. In addition, we design a multi-layer U-Net network to optimize the matrix representation and improve the extraction performance of the model. Experimental results on two public datasets show that our model outperforms other baseline models on triples of all categories
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:2199-4536
2198-6053
DOI:10.1007/s40747-023-01004-8