AM-Bi-LSTM: Adaptive Multi-Modal Bi-LSTM for Sequential Recommendation
Published in: IEEE Access, Vol. 12, pp. 12720-12733
Main Authors: , , ,
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2024
Summary: Conventional methods for the early fusion of multi-modal features cannot identify which modality is relevant to each user's needs in sequential recommendation. In this paper, we propose the adaptive multi-modal bidirectional long short-term memory network (AM-Bi-LSTM) to recognize the relevant modality for sequential recommendation. Specifically, we construct a new recurrent neural network model that is based on the bidirectional long short-term memory network and ingests multi-modal features, including each user's sequential actions. Our new modality attention module computes the importance of each modality's features for the sequential operations via a late-fusion approach, which enables the method to recognize the relevant modality. In experiments on a multi-modal, sequential dataset of 14,941 clicks constructed from the largest Web service for teachers in Japan, we demonstrate that AM-Bi-LSTM outperforms existing methods in terms of the diversity, explainability, and accuracy of recommendation. Specifically, Recall@10 is 0.1005 higher than that of existing early-fusion methods, and catalog coverage@10 (a diversity measure) is 0.1710 higher than that of existing methods.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3355548
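The modality attention with late fusion described in the summary can be pictured as a softmax-weighted combination of per-modality embeddings, where the learned weights indicate which modality mattered for a given user. The sketch below is only an illustration of that general idea, not the authors' implementation; the function names, relevance scores, and toy vectors are all hypothetical:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attentive_late_fusion(modality_vecs, scores):
    """Late fusion: weight each modality's embedding by a softmax over its
    relevance score, then sum. The weights expose the relevant modality."""
    weights = softmax(scores)
    dim = len(modality_vecs[0])
    fused = [sum(w * vec[i] for w, vec in zip(weights, modality_vecs))
             for i in range(dim)]
    return fused, weights

# Toy 3-d embeddings for three modalities (text, image, metadata).
text = [1.0, 0.0, 0.0]
image = [0.0, 1.0, 0.0]
meta = [0.0, 0.0, 1.0]

# A higher score for the text modality yields a larger fusion weight,
# so the fused vector leans toward the text embedding.
fused, weights = attentive_late_fusion([text, image, meta], [2.0, 0.5, 0.5])
```

In a full model, the relevance scores would come from a learned attention module conditioned on the user's Bi-LSTM state rather than being fixed constants as in this toy example.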