Improving Exploration in Deep Reinforcement Learning for Incomplete Information Competition Environments
Published in | IEEE Transactions on Emerging Topics in Computational Intelligence, pp. 1-12 |
Main Authors | Lin, Jie; Ye, Yuhao; Li, Shaobo; Zhang, Hanlin; Zhao, Peng |
Format | Journal Article |
Language | English |
Published | IEEE, 2025 |
Subjects | Action exploration; Computational intelligence; Convergence; Deep reinforcement learning; Games; Incomplete information competition environments; Neural networks; Object recognition; Q-learning; Reviews; Reward design; Silicon carbide; Sparse reward; Training |
ISSN | 2471-285X |
DOI | 10.1109/TETCI.2025.3555250 |
Abstract | The sparse reward problem widely exists in multi-agent deep reinforcement learning, preventing agents from learning optimal actions and resulting in inefficient interactions with the environment. Many efforts have been made to design denser rewards and promote agent exploration. However, existing methods focus only on the breadth of action exploration, neglecting its rationality, which leads to inefficient action exploration for agents. To address this issue, we propose a novel curiosity-based action exploration method for incomplete information competition game environments, namely IGC, to improve both the breadth and rationality of action exploration in multi-agent deep reinforcement learning for sparse-reward environments. In particular, to enhance the action exploration capability of agents, a distance reward is designed in our IGC method to increase the density of rewards during action exploration, thereby mitigating the sparse reward problem. In addition, by integrating the Intrinsic Curiosity Module (ICM) into DQN, we propose an enhanced ICM-DQN module, which improves the breadth and rationality of action exploration for agents. In this way, our IGC method mitigates the randomness of the existing curiosity mechanism and increases the rationality of agents' action exploration, thereby enhancing exploration efficiency. Finally, we evaluate the effectiveness of our IGC method on an incomplete information card game, namely the Uno card game. The results demonstrate that our IGC method achieves both better action exploration efficiency and a higher winning rate in comparison with existing methods. |
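The two ingredients named in the abstract, a distance-based shaping reward and an ICM-style curiosity bonus added on top of DQN, can be illustrated with a short sketch. The PyTorch code below is not the authors' IGC implementation; the network sizes, the `dist_fn` progress metric, and the weights `eta` and `beta` are all illustrative assumptions. It shows only the standard pattern of adding a forward-model prediction error (curiosity) and a distance-progress term to the sparse extrinsic reward.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM, ACTION_DIM, FEAT_DIM = 64, 8, 32  # illustrative sizes

class ICM(nn.Module):
    """Intrinsic Curiosity Module (Pathak et al. style): the intrinsic
    reward is the forward model's prediction error in feature space."""
    def __init__(self):
        super().__init__()
        # Encoder maps raw states to a learned feature space.
        self.encoder = nn.Sequential(nn.Linear(STATE_DIM, FEAT_DIM), nn.ReLU())
        # Forward model predicts next-state features from (features, action).
        self.forward_model = nn.Linear(FEAT_DIM + ACTION_DIM, FEAT_DIM)
        # Inverse model predicts the action from consecutive features; training
        # it keeps the encoder focused on agent-controllable dynamics.
        self.inverse_model = nn.Linear(2 * FEAT_DIM, ACTION_DIM)

    def curiosity(self, s, a_onehot, s_next):
        phi, phi_next = self.encoder(s), self.encoder(s_next)
        phi_pred = self.forward_model(torch.cat([phi, a_onehot], dim=-1))
        # Novelty signal: poorly predicted transitions earn a larger bonus.
        return 0.5 * (phi_pred - phi_next.detach()).pow(2).sum(dim=-1)

    def losses(self, s, a_onehot, s_next):
        phi, phi_next = self.encoder(s), self.encoder(s_next)
        phi_pred = self.forward_model(torch.cat([phi, a_onehot], dim=-1))
        fwd_loss = F.mse_loss(phi_pred, phi_next.detach())
        a_logits = self.inverse_model(torch.cat([phi, phi_next], dim=-1))
        inv_loss = F.cross_entropy(a_logits, a_onehot.argmax(dim=-1))
        return fwd_loss, inv_loss

def shaped_reward(r_ext, s, a_onehot, s_next, icm, dist_fn, eta=0.01, beta=0.1):
    """Total reward = extrinsic + curiosity bonus + distance shaping.

    dist_fn(s) - dist_fn(s_next) rewards progress toward a goal state,
    densifying a sparse extrinsic reward; the actual metric behind the
    paper's 'distance reward' is an assumption here."""
    with torch.no_grad():
        r_int = eta * icm.curiosity(s, a_onehot, s_next)
        r_dist = beta * (dist_fn(s) - dist_fn(s_next))
    return r_ext + r_int + r_dist
```

In a DQN training loop, `shaped_reward` would replace the raw environment reward when transitions are written to the replay buffer, while `ICM.losses` would be optimized jointly with the Q-network so the encoder keeps tracking action-relevant state features.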
Authors |
– Lin, Jie (ORCID 0000-0003-3476-110X), Xi'an Jiaotong University, Xi'an, China; jielin@mail.xjtu.edu.cn
– Ye, Yuhao, Xi'an Jiaotong University, Xi'an, China; yeyuhao0607@stu.xjtu.edu.cn
– Li, Shaobo (ORCID 0009-0003-8470-010X), Xi'an Jiaotong University, Xi'an, China; 4122151031@stu.xjtu.edu.cn
– Zhang, Hanlin (ORCID 0000-0001-8869-6863), Qingdao University, Qingdao, China; hanlin@qdu.edu.cn
– Zhao, Peng (ORCID 0000-0001-7033-9315), Xi'an Jiaotong University, Xi'an, China; p.zhao@mail.xjtu.edu.cn |
CODEN | ITETCU |
EISSN | 2471-285X |
Genre | orig-research |
GrantInformation |
– Open-End Foundation of National Key Laboratory of Air-based Information Perception and Fusion 6A
– Aviation Foundation of National Key Laboratory of Air-based Information Perception and Fusion (grant ASFC-20240001070002) |
ISSN | 2471-285X |
IsPeerReviewed | true |
IsScholarly | true |
License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
PublicationTitle | IEEE transactions on emerging topics in computational intelligence |
PublicationTitleAbbrev | TETCI |
PublicationYear | 2025 |
Publisher | IEEE |
URI | https://ieeexplore.ieee.org/document/10964687 |