Improving Exploration in Deep Reinforcement Learning for Incomplete Information Competition Environments

Bibliographic Details
Published in: IEEE Transactions on Emerging Topics in Computational Intelligence, pp. 1–12
Main Authors: Lin, Jie; Ye, Yuhao; Li, Shaobo; Zhang, Hanlin; Zhao, Peng
Format: Journal Article
Language: English
Published: IEEE, 2025
Subjects: Action exploration; Computational intelligence; Convergence; Deep reinforcement learning; Games; incomplete information competition environments; Neural networks; Object recognition; Q-learning; Reviews; reward design; Silicon carbide; sparse reward; Training
Online Access: https://ieeexplore.ieee.org/document/10964687
ISSN: 2471-285X
DOI: 10.1109/TETCI.2025.3555250

Abstract: The sparse reward problem is widespread in multi-agent deep reinforcement learning, preventing agents from learning optimal actions and resulting in inefficient interactions with the environment. Many efforts have been made to design denser rewards and promote agent exploration. However, existing methods focus only on the breadth of action exploration and neglect its rationality, which leads to inefficient exploration. To address this issue, this paper proposes a novel curiosity-based action exploration method for incomplete information competition game environments, named IGC, which improves both the breadth and the rationality of action exploration in multi-agent deep reinforcement learning under sparse rewards. In particular, to strengthen the agents' capability for action exploration, the IGC method designs a distance reward that increases the density of rewards during exploration, thereby mitigating the sparse reward problem. In addition, by integrating the Intrinsic Curiosity Module (ICM) into DQN, we propose an enhanced ICM-DQN module that improves both the breadth and the rationality of the agents' action exploration. In this way, the IGC method mitigates the randomness of the existing curiosity mechanism and increases the rationality of the agents' action exploration, thereby enhancing exploration efficiency. Finally, we evaluate the effectiveness of the IGC method on an incomplete information card game, namely the Uno card game. The results demonstrate that the IGC method achieves both better action-exploration efficiency and a higher winning rate than existing methods.
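The abstract names two mechanisms: a dense distance reward and an ICM curiosity bonus integrated into DQN (ICM-DQN). The sketch below (PyTorch) shows one plausible way to combine them; the hand-size distance heuristic, module shapes, and the coefficients eta and gamma are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of a distance shaping reward plus an ICM curiosity bonus
# added to a DQN update, as the abstract describes at a high level.
import torch
import torch.nn as nn
import torch.nn.functional as F


def distance_reward(prev_hand_size: int, hand_size: int) -> float:
    # Hypothetical dense reward for Uno: reward progress toward an empty
    # hand (a plausible "distance to winning"; the paper's exact form is
    # not given in the abstract).
    return float(prev_hand_size - hand_size)


class ICM(nn.Module):
    """Intrinsic Curiosity Module: curiosity = forward-model prediction error."""

    def __init__(self, obs_dim: int, n_actions: int, feat_dim: int = 64):
        super().__init__()
        self.n_actions = n_actions
        self.encoder = nn.Sequential(nn.Linear(obs_dim, feat_dim), nn.ReLU())
        # Forward model: predict next-state features from (features, action).
        self.forward_model = nn.Linear(feat_dim + n_actions, feat_dim)
        # Inverse model: predict the action from (features, next features),
        # which trains the encoder to keep only action-relevant features.
        self.inverse_model = nn.Linear(2 * feat_dim, n_actions)

    def forward(self, obs, action, next_obs):
        phi, phi_next = self.encoder(obs), self.encoder(next_obs)
        a_onehot = F.one_hot(action, self.n_actions).float()
        phi_pred = self.forward_model(torch.cat([phi, a_onehot], dim=-1))
        # Per-transition curiosity bonus: error in predicting the next state.
        curiosity = 0.5 * (phi_pred - phi_next.detach()).pow(2).sum(dim=-1)
        logits = self.inverse_model(torch.cat([phi, phi_next], dim=-1))
        inverse_loss = F.cross_entropy(logits, action)
        return curiosity, inverse_loss


def dqn_loss(q_net, target_net, icm, batch, gamma=0.99, eta=0.01):
    # batch: (obs, action, extrinsic_reward, next_obs, done) tensors.
    obs, action, r_ext, next_obs, done = batch
    curiosity, inverse_loss = icm(obs, action, next_obs)
    # Densified reward: extrinsic (incl. any distance shaping) + curiosity.
    reward = r_ext + eta * curiosity.detach()
    with torch.no_grad():
        target = reward + gamma * (1.0 - done) * target_net(next_obs).max(dim=-1).values
    q = q_net(obs).gather(1, action.unsqueeze(1)).squeeze(1)
    td_loss = F.smooth_l1_loss(q, target)
    icm_loss = curiosity.mean() + inverse_loss  # trains ICM alongside DQN
    return td_loss, icm_loss
```

Under these assumptions, the curiosity term rewards transitions the forward model predicts poorly, which drives exploration breadth, while the distance term supplies the denser extrinsic signal the abstract attributes to the distance reward.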
Authors:
– Lin, Jie (Xi'an Jiaotong University, Xi'an, China; jielin@mail.xjtu.edu.cn; ORCID: 0000-0003-3476-110X)
– Ye, Yuhao (Xi'an Jiaotong University, Xi'an, China; yeyuhao0607@stu.xjtu.edu.cn)
– Li, Shaobo (Xi'an Jiaotong University, Xi'an, China; 4122151031@stu.xjtu.edu.cn; ORCID: 0009-0003-8470-010X)
– Zhang, Hanlin (Qingdao University, Qingdao, China; hanlin@qdu.edu.cn; ORCID: 0000-0001-8869-6863)
– Zhao, Peng (Xi'an Jiaotong University, Xi'an, China; p.zhao@mail.xjtu.edu.cn; ORCID: 0000-0001-7033-9315)
CODEN: ITETCU
Genre: Original research
Funding:
– Open-End Foundation of National Key Laboratory of Air-based Information Perception and Fusion 6A
– Aviation Foundation of National Key Laboratory of Air-based Information Perception and Fusion (Grant ASFC-20240001070002)
Peer Reviewed: Yes
Scholarly: Yes
License: https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html; https://doi.org/10.15223/policy-029; https://doi.org/10.15223/policy-037