MM-Transformer: A Transformer-Based Knowledge Graph Link Prediction Model That Fuses Multimodal Features

Bibliographic Details
Published in: Symmetry (Basel), Vol. 16, No. 8, p. 961
Main Authors: Wang, Dongsheng; Tang, Kangjie; Zeng, Jun; Pan, Yue; Dai, Yun; Li, Huige; Han, Bin
Format: Journal Article
Language: English
Published: Basel: MDPI AG, 01.08.2024

More Information
Summary: Multimodal knowledge graph completion requires integrating information from multiple modalities (such as images and text) into the structural representation of entities to improve link prediction. However, most existing studies overlook the interaction between different modalities and the symmetry of the modal fusion process. To address this issue, this paper proposes MM-Transformer, a Transformer-based knowledge graph link prediction model that fuses multimodal features. Separate modal encoders extract structural, visual, and textual features, and symmetrical hybrid key-value calculations are performed on the features of the different modalities within the Transformer architecture. The similarities of textual tags to structural tags and to visual tags are calculated and aggregated, respectively, and the multimodal entity representations are modeled and optimized to reduce their heterogeneity. Experimental results show that, compared with the current multimodal SOTA method MKGformer, MM-Transformer improves Hits@1 and Hits@10 by 1.17% and 1.39%, respectively, demonstrating that the proposed method effectively addresses multimodal feature fusion in the knowledge graph link prediction task.
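The fusion described in the summary can be pictured as textual features attending over structural and visual key-value pairs, with the two modality-specific outputs then aggregated into one entity representation. Below is a minimal PyTorch sketch of that idea; the module name, dimensions, and the simple averaging-based aggregation are illustrative assumptions, not the authors' released implementation.

```python
# A minimal sketch of hybrid key-value fusion over three modalities.
# Names, dimensions, and the aggregation scheme are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class HybridKeyValueFusion(nn.Module):
    """Fuses structural, visual, and textual entity features with attention.

    Textual features act as queries; structural and visual features are
    projected into key/value pairs, scored against the queries, and the two
    attention outputs are aggregated into one multimodal representation.
    """

    def __init__(self, dim: int = 256):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim)
        self.k_struct = nn.Linear(dim, dim)
        self.v_struct = nn.Linear(dim, dim)
        self.k_vis = nn.Linear(dim, dim)
        self.v_vis = nn.Linear(dim, dim)
        self.out_proj = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, text, struct, vis):
        # text:   (batch, n_text, dim)   textual token features
        # struct: (batch, n_struct, dim) structural (graph) features
        # vis:    (batch, n_vis, dim)    visual patch features
        q = self.q_proj(text)

        # Similarity of textual features to structural features.
        attn_s = F.softmax(q @ self.k_struct(struct).transpose(-2, -1) * self.scale, dim=-1)
        struct_ctx = attn_s @ self.v_struct(struct)

        # Similarity of textual features to visual features.
        attn_v = F.softmax(q @ self.k_vis(vis).transpose(-2, -1) * self.scale, dim=-1)
        vis_ctx = attn_v @ self.v_vis(vis)

        # Symmetric aggregation of the two modality-specific contexts
        # (a plain average here; the paper's exact weighting is not shown).
        fused = self.out_proj(0.5 * (struct_ctx + vis_ctx)) + text
        return fused


if __name__ == "__main__":
    fusion = HybridKeyValueFusion(dim=256)
    text = torch.randn(2, 16, 256)
    struct = torch.randn(2, 8, 256)
    vis = torch.randn(2, 49, 256)
    print(fusion(text, struct, vis).shape)  # torch.Size([2, 16, 256])
```

The residual connection back to the textual queries keeps the fused entity representation anchored in the text modality, which matches the summary's framing of text similarities being computed against the other two modalities and then aggregated.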
ISSN: 2073-8994
DOI: 10.3390/sym16080961