Improving Exploration in Deep Reinforcement Learning for Incomplete Information Competition Environments

Bibliographic Details
Published in: IEEE Transactions on Emerging Topics in Computational Intelligence, pp. 1-12
Main Authors: Lin, Jie; Ye, Yuhao; Li, Shaobo; Zhang, Hanlin; Zhao, Peng
Format: Journal Article
Language: English
Published: IEEE, 2025
ISSN: 2471-285X
DOI: 10.1109/TETCI.2025.3555250

Summary: The sparse-reward problem is widespread in multi-agent deep reinforcement learning, preventing agents from learning optimal actions and resulting in inefficient interactions with the environment. Many efforts have been made to design denser rewards and promote agent exploration. However, existing methods focus only on the breadth of action exploration and neglect its rationality, which leads to inefficient exploration. To address this issue, this paper proposes IGC, a novel curiosity-based action exploration method for incomplete-information competitive game environments, which improves both the breadth and the rationality of action exploration in multi-agent deep reinforcement learning under sparse rewards. In particular, to strengthen agents' exploration capability, IGC introduces a distance reward that increases reward density during exploration, thereby mitigating the sparse-reward problem. In addition, by integrating the Intrinsic Curiosity Module (ICM) into DQN, we propose an enhanced ICM-DQN module that further improves both the breadth and the rationality of action exploration. In this way, IGC reduces the randomness of the existing curiosity mechanism and makes agents' exploration more rational, thereby improving exploration efficiency. Finally, we evaluate the effectiveness of IGC on an incomplete-information card game, the Uno card game. The results demonstrate that IGC achieves both better exploration efficiency and a higher winning rate than existing methods.
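
The abstract describes folding an ICM-style intrinsic curiosity bonus into DQN training alongside an extrinsic reward. The sketch below illustrates that general pattern only; it is not the authors' ICM-DQN module, and the feature encoder, network sizes, weighting coefficient beta, and batch layout are illustrative assumptions (the paper's distance reward is represented here only by a generic extrinsic reward term).

# Minimal sketch (assumed design, not the paper's code): ICM-style curiosity
# bonus combined with a DQN target.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICM(nn.Module):
    def __init__(self, obs_dim, n_actions, feat_dim=64):
        super().__init__()
        # Feature encoder phi(s); sizes are illustrative assumptions
        self.encoder = nn.Sequential(nn.Linear(obs_dim, feat_dim), nn.ReLU())
        # Forward model: predict phi(s') from (phi(s), a)
        self.forward_model = nn.Sequential(
            nn.Linear(feat_dim + n_actions, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim))
        self.n_actions = n_actions

    def intrinsic_reward(self, obs, action, next_obs):
        phi, phi_next = self.encoder(obs), self.encoder(next_obs)
        a_onehot = F.one_hot(action, self.n_actions).float()
        phi_pred = self.forward_model(torch.cat([phi, a_onehot], dim=-1))
        # Curiosity bonus = forward-model prediction error
        return 0.5 * (phi_pred - phi_next).pow(2).sum(dim=-1)

def dqn_target(q_net, icm, batch, gamma=0.99, beta=0.01):
    # batch: observations (float), actions (long), extrinsic rewards (float),
    # next observations (float), done flags (float in {0, 1})
    obs, action, ext_reward, next_obs, done = batch
    # Total reward = extrinsic reward (e.g. a denser shaped reward) + curiosity bonus
    r = ext_reward + beta * icm.intrinsic_reward(obs, action, next_obs).detach()
    with torch.no_grad():
        target = r + gamma * (1 - done) * q_net(next_obs).max(dim=-1).values
    return target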