LPIC: Learnable Prompts and ID-guided Contrastive Learning for Multimodal Recommendation

Bibliographic Details
Published in: ACM Transactions on Multimedia Computing, Communications, and Applications
Main Authors: Liu, Xin; Song, Qiya; Xiao, Lin; Wang, Chun; Gao, Xieping
Format: Journal Article
Language: English
Published: 23.05.2025
ISSN: 1551-6857, 1551-6865
DOI: 10.1145/3735561

Summary: Multimodal recommendation systems improve recommendation accuracy by integrating information from different modalities to obtain latent representations of users and items. However, existing multimodal recommendation methods often use a single user embedding to model users' interests across modalities, neglecting modality-specific information. Furthermore, the semantics expressed by the same item in different modalities may be inconsistent, leading to suboptimal recommendation performance. To alleviate these issues, we propose a new multimodal recommendation framework called Learnable Prompts and ID-guided Contrastive Learning (LPIC). Specifically, we introduce a continuously learnable prompt embedding method that incorporates the multimodal features of items to model users' interests in specific modalities. We then propose an ID-guided contrastive learning component that enhances historical interaction features in the textual, visual, and fused modalities, while aligning the text, image, and fused modalities to improve semantic consistency between them. Finally, extensive experiments on three publicly available Amazon datasets demonstrate the effectiveness of the LPIC framework.
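
The abstract does not spell out the prompt-fusion rule or the exact contrastive objective, so the PyTorch sketch below is only one plausible reading, not the paper's implementation: a trainable prompt vector per modality fused additively with item features, and an InfoNCE-style loss with in-batch negatives that anchors each item's textual, visual, and fused views on its ID embedding. All names (ModalityPrompt, info_nce, id_emb, and so on) are hypothetical.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ModalityPrompt(nn.Module):
        # Hypothetical learnable prompt: one trainable vector per modality,
        # fused with the item's features for that modality. Additive fusion
        # is an assumption; the paper may use concatenation or attention.
        def __init__(self, dim):
            super().__init__()
            self.prompt = nn.Parameter(torch.randn(dim) * 0.02)

        def forward(self, modality_feat):  # modality_feat: (batch, dim)
            return modality_feat + self.prompt

    def info_nce(anchor, positive, temperature=0.2):
        # InfoNCE with in-batch negatives: row i of `anchor` is pulled
        # toward row i of `positive` and pushed away from every other row.
        anchor = F.normalize(anchor, dim=-1)
        positive = F.normalize(positive, dim=-1)
        logits = anchor @ positive.t() / temperature  # (batch, batch)
        labels = torch.arange(anchor.size(0), device=anchor.device)
        return F.cross_entropy(logits, labels)

    def id_guided_alignment(id_emb, text_emb, vis_emb, fused_emb, tau=0.2):
        # Anchor the three modality views of each item on its ID embedding,
        # encouraging semantic consistency across modalities.
        return (info_nce(id_emb, text_emb, tau)
                + info_nce(id_emb, vis_emb, tau)
                + info_nce(id_emb, fused_emb, tau))

In practice such an alignment term would be added to the main recommendation loss with a weighting coefficient; the temperature value and the additive prompt fusion above are placeholder choices.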