RDF-star2Vec: RDF-star Graph Embeddings for Data Mining

Bibliographic Details
Published in IEEE Access Vol. 11; pp. 142030 - 142042
Main Authors Egami, Shusaku, Ugai, Takanori, Oota, Masateru, Matsushita, Kyoumoto, Kawamura, Takahiro, Kozaki, Kouji, Fukuda, Ken
Format Journal Article
Language English
Published Piscataway IEEE 2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: Knowledge Graphs (KGs) such as Resource Description Framework (RDF) data represent relationships between various entities through the structure of triples (<subject, predicate, object>). Knowledge graph embedding (KGE) is crucial in machine learning applications, specifically in node classification and link prediction tasks. KGE remains a vital research topic within the semantic web community. RDF-star introduces the concept of a quoted triple (QT), a specific form of triple employed either as the subject or object within another triple. Moreover, RDF-star permits a QT to act as compositional entities within another QT, thereby enabling the representation of recursive, hyper-relational KGs with nested structures. However, existing KGE models fail to adequately learn the semantics of QTs and entities, primarily because they do not account for RDF-star graphs containing multi-leveled nested QTs and QT-QT relationships. This study introduces RDF-star2Vec, a novel KGE model specifically designed for RDF-star graphs. RDF-star2Vec introduces graph walk techniques that enable probabilistic transitions between a QT and its compositional entities. Feature vectors for QTs, entities, and relations are derived from generated sequences through the structured skip-gram model. Additionally, we provide a dataset and a benchmarking framework for data mining tasks focused on complex RDF-star graphs. Evaluative experiments demonstrated that RDF-star2Vec yielded superior performance compared to recent extensions of RDF2Vec in various tasks including classification, clustering, entity relatedness, and QT similarity.
ISSN:2169-3536
DOI:10.1109/ACCESS.2023.3341029
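
The summary above describes graph walks that can move between a QT and its compositional entities. The following is a minimal Python sketch of that idea only, not the authors' implementation. Quoted triples are modeled as nested tuples, and the names `walk`, `alpha`, `structural_neighbours`, `GRAPH`, and `QT` are illustrative assumptions. The actual transition probabilities and walk strategies are defined in the paper and are not reproduced here.

```python
import random
from typing import List, Tuple, Union

# Assumption: an entity is a plain string; a quoted triple (QT) is a
# 3-tuple (subject, predicate, object) whose subject/object may be QTs.
Term = Union[str, tuple]
Triple = Tuple[Term, str, Term]


def is_quoted(term: Term) -> bool:
    return isinstance(term, tuple)


def label(term: Term) -> str:
    """Render a term as one token, using << ... >> for quoted triples."""
    if is_quoted(term):
        s, p, o = term
        return f"<< {label(s)} {p} {label(o)} >>"
    return term


def structural_neighbours(node: Term, quoted: List[tuple]) -> List[Term]:
    """Terms reachable without an asserted edge: the subject and object
    of a QT, plus every known QT that has `node` as subject or object."""
    out: List[Term] = []
    if is_quoted(node):
        out += [node[0], node[2]]
    out += [qt for qt in quoted if node in (qt[0], qt[2])]
    return out


def walk(graph: List[Triple], start: Term, depth: int,
         rng: random.Random, alpha: float = 0.5) -> List[str]:
    """One random walk of at most `depth` steps.

    With probability `alpha` (a hypothetical parameter) the walk jumps
    between a QT and its components; otherwise it follows an asserted
    edge and emits the predicate as a token.
    """
    # QTs that occur as the subject or object of an asserted triple.
    quoted = list(dict.fromkeys(
        t for s, _, o in graph for t in (s, o) if is_quoted(t)))
    node, seq = start, [label(start)]
    for _ in range(depth):
        jumps = structural_neighbours(node, quoted)
        if jumps and rng.random() < alpha:
            node = rng.choice(jumps)
            seq.append(label(node))
            continue
        edges = [(p, o) for s, p, o in graph if s == node]
        if not edges:
            break  # dead end: no outgoing asserted edge
        p, node = rng.choice(edges)
        seq += [p, label(node)]
    return seq


# Illustrative data: one QT annotated with a start year.
QT = ("alice", "worksAt", "acme")
GRAPH: List[Triple] = [(QT, "since", "2020"), ("alice", "knows", "bob")]
```

Token sequences produced this way could then be passed to a skip-gram style trainer to obtain vectors for QTs, entities, and relations; that step is omitted here.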