Deep Reinforcement Learning Recommendation System Algorithm Based on Multi-Level Attention Mechanisms

Traditional recommendation systems, which rely on static user profiles and historical interaction data, frequently face difficulties in adapting to the rapid changes in user preferences that are typical of dynamic environments. In contrast, recommendation algorithms based on deep reinforcement learn...

Full description

Saved in:
Bibliographic Details
Published inElectronics (Basel) Vol. 13; no. 23; p. 4625
Main Authors Wang, Gaopeng, Ding, Jingyi, Hu, Fanlin
Format Journal Article
LanguageEnglish
Published Basel MDPI AG 01.12.2024
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Traditional recommendation systems, which rely on static user profiles and historical interaction data, frequently face difficulties in adapting to the rapid changes in user preferences that are typical of dynamic environments. In contrast, recommendation algorithms based on deep reinforcement learning are capable of dynamically adjusting their strategies to accommodate real-time fluctuations in user preferences. However, current deep reinforcement learning recommendation algorithms encounter several challenges, including the oversight of item features associated with high long-term rewards that reflect users’ enduring interests, as well as a lack of significant relevance between user attributes and item characteristics. This leads to an inadequate extraction of personalized information. To address these issues, this study presents a novel recommendation system known as the Multi-Level Hierarchical Attention Mechanism Deep Reinforcement Recommendation (MHDRR), which is fundamentally grounded in a multi-layer attention mechanism. This mechanism consists of a local attention layer, a global attention layer, and a Transformer layer, allowing for a detailed analysis of individual attributes and interactions within short-term preferred items, while also exploring users’ long-term interests. This methodology promotes a comprehensive understanding of users’ immediate and enduring preferences, thereby improving the overall effectiveness of the system over time. Experimental results obtained from three publicly available datasets validate the effectiveness of the proposed model.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:2079-9292
2079-9292
DOI:10.3390/electronics13234625