Structure-Augment based Long-Tailed Knowledge Graph Completion Model

Bibliographic Details
Published in: 2024 27th International Conference on Computer Supported Cooperative Work in Design (CSCWD), pp. 3218-3223
Main Authors: Wang, Jianrong; Tang, Yi; Hou, Dejun; Wang, Jinchi; Meng, Zechen; Xu, Tianyi
Format: Conference Proceeding
Language: English
Published: IEEE, 08.05.2024

Summary: The application of knowledge graph completion in industry and the Internet of Things spans many areas, from improving production efficiency to providing intelligent decision support. When constructing a knowledge graph, information is extracted from textual documents and online web pages, but such extraction is incomplete because the available information is limited, so the knowledge graph must be completed. Knowledge graph completion models map triples into different vector spaces for representation, yet two shortcomings remain: 1) the textual encoder lacks structured knowledge, and 2) the distribution of relations is long-tailed, so the model predominantly predicts the more frequent relations. In this paper, we propose a structure-augment based long-tailed knowledge graph completion model (SALT-KGC) to address both issues. For the first, we partition each triple into two asymmetric parts, as in translation-based graph embedding approaches, and encode both parts with a Siamese-style textual encoder; the model employs a classifier for representation learning and a spatial measurement for structure learning, increasing the structured knowledge available to the encoder. For the second, we apply focal loss to counter the imbalance between positive and negative samples and to focus training on hard examples. Moreover, we develop a self-adaptive ensemble scheme that further improves performance by incorporating predictions from an existing graph embedding model. SALT-KGC achieves state-of-the-art performance on two widely used public datasets.
ISSN:2768-1904
DOI:10.1109/CSCWD61410.2024.10580212