Sentence-level Distant Supervision Relation Extraction based on Dynamic Soft Labels

Bibliographic Details
Published in 2024 27th International Conference on Computer Supported Cooperative Work in Design (CSCWD), pp. 3194-3199
Main Authors Hou, Dejun; Zhang, Zefeng; Zhao, Mankun; Zhang, Wenbin; Zhao, Yue; Yu, Jian
Format Conference Proceeding
Language English
Published IEEE 08.05.2024
Summary: Distant supervision is widely used in relation extraction because it can automatically annotate data from an existing knowledge graph and a corpus. Inevitably, it also introduces noisy labels. The usual remedy is to put all sentences that share an entity pair into a bag, assign the bag a single label, and predict relations at the bag level. However, in some downstream tasks, such as question answering and semantic parsing, accurate sentence-level prediction is more important. In this paper, we therefore study the sentence level and propose a novel and efficient sentence-level distant supervision relation extraction framework, SEDSL. Specifically, we adopt soft labels that are dynamically updated during training to provide more accurate supervision signals and reduce the influence of noisy labels, and we propose a tighter noise-filtering and re-labeling strategy that identifies noisy instances and re-labels them. Moreover, SEDSL is independent of the backbone network structure, so it is general and can be applied to various sentence encoders. Extensive experiments on the NYT-10 dataset show that the proposed framework significantly improves over all baseline methods in both sentence-level relation extraction and noise reduction.
ISSN:2768-1904
DOI:10.1109/CSCWD61410.2024.10580472