Sentence-level Distant Supervision Relation Extraction based on Dynamic Soft Labels
Published in | 2024 27th International Conference on Computer Supported Cooperative Work in Design (CSCWD) pp. 3194 - 3199 |
---|---|
Main Authors | , , , , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE 08.05.2024 |
Summary: | Distant supervision is widely used in relation extraction because it can automatically annotate data from an existing knowledge graph and corpus. Inevitably, it also introduces noisy labels. The usual remedy is to put all sentences that share an entity pair into a bag, assign them a bag-level label, and predict relations at the bag level. However, in some downstream tasks, such as question answering and semantic parsing, accurate sentence-level prediction is more important. In this paper, we therefore study the sentence level and propose SEDSL, a novel and efficient sentence-level distant supervision relation extraction framework. Specifically, we adopt soft labels that are dynamically updated during training to provide more accurate supervision signals and reduce the influence of noisy labels, and we propose a tighter noise-filtering and re-labeling strategy to identify noisy instances and re-label them. Moreover, SEDSL is independent of the backbone network structure, so it is more general and can be applied to various sentence encoders. Extensive experiments on the NYT-10 dataset show that the proposed framework significantly improves over all baseline methods in both sentence-level relation extraction and noise reduction. |
---|---|
ISSN: | 2768-1904 |
DOI: | 10.1109/CSCWD61410.2024.10580472 |
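
The summary describes soft labels that are updated during training and a strategy that flags noisy instances and re-labels them, but this record does not give the update rule or any thresholds. The following is a minimal Python sketch of one generic way such a scheme could work, not the SEDSL method itself. It assumes an exponential-moving-average update of the soft label toward the model's predicted distribution, plus two confidence thresholds. The function names and the parameters `momentum`, `noise_threshold`, and `relabel_threshold` are hypothetical.

```python
from typing import List, Tuple


def one_hot(label: int, num_classes: int) -> List[float]:
    """Initial soft label: all mass on the distantly supervised relation."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def update_soft_label(
    soft: List[float], probs: List[float], momentum: float = 0.9
) -> List[float]:
    """Blend the current soft label with the model's predicted distribution.

    Assumption: an exponential moving average; the paper may use another rule.
    """
    mixed = [momentum * s + (1.0 - momentum) * p for s, p in zip(soft, probs)]
    total = sum(mixed)
    return [m / total for m in mixed]  # renormalize to guard against drift


def filter_and_relabel(
    soft: List[float],
    ds_label: int,
    noise_threshold: float = 0.3,    # hypothetical value
    relabel_threshold: float = 0.7,  # hypothetical value
) -> Tuple[bool, int]:
    """Return (is_noisy, label_to_train_on) for one sentence.

    A sentence is flagged as noisy when the soft label gives little mass to
    its distant-supervision label. It is re-labeled only when some class has
    accumulated enough mass; otherwise the original label is kept.
    """
    best = max(range(len(soft)), key=lambda i: soft[i])
    is_noisy = soft[ds_label] < noise_threshold
    if is_noisy and soft[best] >= relabel_threshold:
        return True, best
    return is_noisy, ds_label


if __name__ == "__main__":
    # One sentence labeled as relation 2, while the model keeps predicting 0.
    soft = one_hot(2, num_classes=3)
    for epoch in range(3):
        soft = update_soft_label(soft, [0.9, 0.05, 0.05], momentum=0.5)
        print(epoch, [round(s, 4) for s in soft], filter_and_relabel(soft, 2))
```

Because the sketch only consumes predicted probabilities, it makes no assumption about the sentence encoder that produces them, which is in the spirit of the backbone independence the summary claims.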