Improving Exploration in Deep Reinforcement Learning for Incomplete Information Competition Environments

Bibliographic Details
Published in: IEEE Transactions on Emerging Topics in Computational Intelligence, pp. 1-12
Main Authors: Lin, Jie; Ye, Yuhao; Li, Shaobo; Zhang, Hanlin; Zhao, Peng
Format: Journal Article
Language: English
Published: IEEE, 2025
ISSN: 2471-285X
DOI: 10.1109/TETCI.2025.3555250

Summary: The sparse-reward problem is widespread in multi-agent deep reinforcement learning, preventing agents from learning optimal actions and resulting in inefficient interactions with the environment. Many efforts have been made to design denser rewards and promote agent exploration. However, existing methods focus only on the breadth of action exploration and neglect its rationality, which leads to inefficient exploration. To address this issue, this paper proposes IGC, a novel curiosity-based action exploration method for incomplete-information competitive game environments, which improves both the breadth and the rationality of action exploration in multi-agent deep reinforcement learning under sparse rewards. In particular, to strengthen agents' exploration capability, IGC introduces a distance reward that increases reward density during exploration, thereby mitigating the sparse-reward problem. In addition, by integrating the Intrinsic Curiosity Module (ICM) into DQN, we propose an enhanced ICM-DQN module that further improves both the breadth and the rationality of action exploration. In this way, IGC reduces the randomness of the existing curiosity mechanism and makes agents' exploration more rational, thereby improving exploration efficiency. Finally, we evaluate the effectiveness of IGC on an incomplete-information card game, the Uno card game. The results demonstrate that IGC achieves both better exploration efficiency and a higher winning rate than existing methods.
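
The abstract describes folding an ICM-style intrinsic curiosity bonus into DQN training alongside an extrinsic reward. The sketch below illustrates that general pattern only; it is not the authors' ICM-DQN module, and the feature encoder, network sizes, weighting coefficient beta, and batch layout are illustrative assumptions (the paper's distance reward is represented here only by a generic extrinsic reward term).

# Minimal sketch (assumed design, not the paper's code): ICM-style curiosity
# bonus combined with a DQN target.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICM(nn.Module):
    def __init__(self, obs_dim, n_actions, feat_dim=64):
        super().__init__()
        # Feature encoder phi(s); sizes are illustrative assumptions
        self.encoder = nn.Sequential(nn.Linear(obs_dim, feat_dim), nn.ReLU())
        # Forward model: predict phi(s') from (phi(s), a)
        self.forward_model = nn.Sequential(
            nn.Linear(feat_dim + n_actions, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim))
        self.n_actions = n_actions

    def intrinsic_reward(self, obs, action, next_obs):
        phi, phi_next = self.encoder(obs), self.encoder(next_obs)
        a_onehot = F.one_hot(action, self.n_actions).float()
        phi_pred = self.forward_model(torch.cat([phi, a_onehot], dim=-1))
        # Curiosity bonus = forward-model prediction error
        return 0.5 * (phi_pred - phi_next).pow(2).sum(dim=-1)

def dqn_target(q_net, icm, batch, gamma=0.99, beta=0.01):
    # batch: observations (float), actions (long), extrinsic rewards (float),
    # next observations (float), done flags (float in {0, 1})
    obs, action, ext_reward, next_obs, done = batch
    # Total reward = extrinsic reward (e.g. a denser shaped reward) + curiosity bonus
    r = ext_reward + beta * icm.intrinsic_reward(obs, action, next_obs).detach()
    with torch.no_grad():
        target = r + gamma * (1 - done) * q_net(next_obs).max(dim=-1).values
    return target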