LPIC: Learnable Prompts and ID-guided Contrastive Learning for Multimodal Recommendation
Published in | ACM Transactions on Multimedia Computing, Communications, and Applications
---|---
Main Authors | , , , ,
Format | Journal Article
Language | English
Published | 23.05.2025
ISSN | 1551-6857, 1551-6865
DOI | 10.1145/3735561
Summary: Multimodal recommendation systems improve recommendation accuracy by integrating information from different modalities to obtain latent representations of users and items. However, existing multimodal recommendation methods often use a single user embedding to model users' interests across different modalities, neglecting modality-specific information. Furthermore, the semantics expressed by the same item in different modalities may be inconsistent, leading to suboptimal recommendation performance. To alleviate these issues, we propose a new multimodal recommendation framework called Learnable Prompts and ID-guided Contrastive Learning (LPIC). Specifically, we introduce a continuously learnable prompt embedding method that incorporates the multimodal features of items to model users' interests in specific modalities. We then propose an ID-guided contrastive learning component that enhances historical interaction features in the textual, visual, and fused modalities, while aligning the text, image, and fused modalities to improve semantic consistency between them. Finally, we conduct extensive experiments on three publicly available Amazon datasets to demonstrate the effectiveness of the LPIC framework.
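The record gives only the abstract, with no implementation details. As a rough illustration of the ID-guided cross-modal alignment the abstract describes, the following is a minimal sketch of an in-batch InfoNCE loss that pulls an item's text, image, and fused embeddings toward its ID embedding. Everything here is an assumption for illustration (the PyTorch framework, averaging as the fusion operator, the temperature of 0.2, and the hypothetical names `info_nce` and `alignment_loss`), not the LPIC paper's actual formulation.

```python
# Hypothetical sketch of ID-guided cross-modal contrastive alignment.
# Fusion by averaging, in-batch negatives, and the temperature value are
# illustrative assumptions, not the method described in the LPIC paper.
import torch
import torch.nn.functional as F

def info_nce(anchor: torch.Tensor, positive: torch.Tensor,
             temperature: float = 0.2) -> torch.Tensor:
    """In-batch InfoNCE: row i of `anchor` should match row i of `positive`."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    logits = anchor @ positive.t() / temperature   # (B, B) similarity matrix
    labels = torch.arange(anchor.size(0), device=anchor.device)
    return F.cross_entropy(logits, labels)

def alignment_loss(id_emb: torch.Tensor, text_emb: torch.Tensor,
                   image_emb: torch.Tensor) -> torch.Tensor:
    """Align text, image, and a fused view with the ID embedding of each item.

    All inputs are (B, d) batches assumed to be already projected into a
    shared d-dimensional space.
    """
    fused_emb = 0.5 * (text_emb + image_emb)       # simple fused view (assumption)
    # The ID embedding acts as a shared anchor pulling all modality views together.
    return (info_nce(id_emb, text_emb)
            + info_nce(id_emb, image_emb)
            + info_nce(id_emb, fused_emb))

if __name__ == "__main__":
    B, d = 32, 64
    loss = alignment_loss(torch.randn(B, d), torch.randn(B, d), torch.randn(B, d))
    print(loss.item())
```

Because the ID embedding serves as a common anchor in this sketch, the text, image, and fused views of the same item are pushed toward one another indirectly, which is one plausible way to encourage the cross-modal semantic consistency the abstract mentions.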