LPIC: Learnable Prompts and ID-guided Contrastive Learning for Multimodal Recommendation

Bibliographic Details
Published in: ACM Transactions on Multimedia Computing, Communications, and Applications
Main Authors: Liu, Xin; Song, Qiya; Xiao, Lin; Wang, Chun; Gao, Xieping
Format: Journal Article
Language: English
Published: 23.05.2025
ISSN: 1551-6857, 1551-6865
DOI: 10.1145/3735561

Summary: Multimodal recommendation systems improve recommendation accuracy by integrating information from different modalities to obtain latent representations of users and items. However, existing multimodal recommendation methods often use a single user embedding to model users' interests across modalities, neglecting modality-specific information. Furthermore, the semantics expressed by the same item in different modalities may be inconsistent, leading to suboptimal recommendation performance. To alleviate these issues, we propose a new multimodal recommendation framework called Learnable Prompts and ID-guided Contrastive Learning (LPIC). Specifically, we introduce a continuously learnable prompt embedding method that incorporates the multimodal features of items to model users' interests in specific modalities. We then propose an ID-guided contrastive learning component that enhances historical interaction features in the textual, visual, and fused modalities, while aligning the text, image, and fused modalities to improve semantic consistency between them. Finally, extensive experiments on three publicly available Amazon datasets demonstrate the effectiveness of the LPIC framework.
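
The abstract does not spell out the prompt-fusion rule or the exact contrastive objective, so the PyTorch sketch below is only one plausible reading, not the paper's implementation: a trainable prompt vector per modality fused additively with item features, and an InfoNCE-style loss with in-batch negatives that anchors each item's textual, visual, and fused views on its ID embedding. All names (ModalityPrompt, info_nce, id_emb, and so on) are hypothetical.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ModalityPrompt(nn.Module):
        # Hypothetical learnable prompt: one trainable vector per modality,
        # fused with the item's features for that modality. Additive fusion
        # is an assumption; the paper may use concatenation or attention.
        def __init__(self, dim):
            super().__init__()
            self.prompt = nn.Parameter(torch.randn(dim) * 0.02)

        def forward(self, modality_feat):  # modality_feat: (batch, dim)
            return modality_feat + self.prompt

    def info_nce(anchor, positive, temperature=0.2):
        # InfoNCE with in-batch negatives: row i of `anchor` is pulled
        # toward row i of `positive` and pushed away from every other row.
        anchor = F.normalize(anchor, dim=-1)
        positive = F.normalize(positive, dim=-1)
        logits = anchor @ positive.t() / temperature  # (batch, batch)
        labels = torch.arange(anchor.size(0), device=anchor.device)
        return F.cross_entropy(logits, labels)

    def id_guided_alignment(id_emb, text_emb, vis_emb, fused_emb, tau=0.2):
        # Anchor the three modality views of each item on its ID embedding,
        # encouraging semantic consistency across modalities.
        return (info_nce(id_emb, text_emb, tau)
                + info_nce(id_emb, vis_emb, tau)
                + info_nce(id_emb, fused_emb, tau))

In practice such an alignment term would be added to the main recommendation loss with a weighting coefficient; the temperature value and the additive prompt fusion above are placeholder choices.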