A Vision-Based Attention Deep Q-Network with Prior-Based Knowledge

Bibliographic Details
Published in 2023 China Automation Congress (CAC), pp. 6155-6160
Main Authors Ma, Jialin; Li, Ce; Hong, Liang; Wei, Kailun; Zhao, Shutian; Jiang, Hangfei
Format Conference Proceeding
Language English
Published IEEE 17.11.2023
Abstract To reveal the inner workings of deep reinforcement learning (DRL) models and explain which regions of interest the agent attends to during decision-making, vision-based RL employs attention mechanisms. However, because policy optimization changes the data domain, the agent may even fail to learn a policy. To address this, a vision-based attention deep Q-network (VADQN) method with a prior-based mechanism is proposed. First, prior attention maps are obtained using a learnable Gaussian filter and the spectral residual method. Next, the attention maps are fine-tuned using a self-attention mechanism to improve their performance. During RL training, the attention maps and the parameters of the policy network are trained simultaneously, so that explanations of the regions of interest remain available during online training. Finally, a series of ablation experiments was conducted on Atari games to compare the proposed method with a human baseline, the Nature convolutional neural network, and other approaches. The results demonstrate that the proposed method not only reveals the regions of interest attended to by the DRL agent during decision-making but also improves DRL performance in certain scenarios.
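The abstract names the spectral residual method followed by a Gaussian filter as the source of the prior attention maps. The following is a minimal NumPy sketch of the standard spectral residual saliency computation, included only to illustrate that step; it is not the authors' implementation. The function names (`shifted_sum`, `gaussian_kernel`, `spectral_residual_saliency`), the 3x3 averaging window, the 9x9 kernel size, and the fixed `sigma` are all assumptions made for illustration. In particular, the paper describes the Gaussian filter as learnable, whereas here `sigma` is an ordinary fixed argument.

```python
import numpy as np

def shifted_sum(img, kernel):
    """Correlate a 2-D image with a small odd-sized kernel, using edge padding."""
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

def gaussian_kernel(size, sigma):
    """Normalized 2-D Gaussian kernel (size assumed odd)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()

def spectral_residual_saliency(frame, sigma=2.5):
    """Return a saliency map in [0, 1] for a 2-D grayscale frame.

    sigma is fixed here; the paper's learnable filter is NOT reproduced.
    """
    spectrum = np.fft.fft2(frame.astype(np.float64))
    log_amp = np.log(np.abs(spectrum) + 1e-8)   # log amplitude spectrum
    phase = np.angle(spectrum)                  # phase spectrum
    # Spectral residual: log amplitude minus its local (3x3) average.
    residual = log_amp - shifted_sum(log_amp, np.full((3, 3), 1.0 / 9.0))
    # Back to the image domain, keeping the original phase.
    saliency = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    # Smooth with a Gaussian filter, then rescale to [0, 1].
    saliency = shifted_sum(saliency, gaussian_kernel(9, sigma))
    saliency -= saliency.min()
    return saliency / (saliency.max() + 1e-8)

# Example: an 84x84 frame (a common Atari preprocessing size) with one bright patch.
frame = np.zeros((84, 84))
frame[40:44, 40:44] = 1.0
prior_map = spectral_residual_saliency(frame)
```

In a setup like the one the abstract describes, a map of this kind would serve only as the prior; the self-attention fine-tuning and the joint training with the policy network are separate stages that this sketch does not cover.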
Author Affiliations School of Electrical Engineering and Information Engineering, Lanzhou University of Technology, Lanzhou, China (all six authors: Ma, Jialin; Li, Ce; Hong, Liang; Wei, Kailun; Zhao, Shutian; Jiang, Hangfei)
DOI 10.1109/CAC59555.2023.10451132
EISBN 9798350303759
EISSN 2688-0938
Funding National Natural Science Foundation of China (grant 62363025; funder ID 10.13039/501100001809); Gansu Education Department, China (grant 2023CXZX-468; funder ID 10.13039/501100009590)
Subjects Atari games; Decision making; Deep reinforcement learning; Games; Optimization; Prior-based; Self-attention; Solid modeling; Solids; Task analysis; Training; Vision-based attention
URI https://ieeexplore.ieee.org/document/10451132