Policy Sharing Using Aggregation Trees for Q-Learning in a Continuous State and Action Spaces

Bibliographic Details
Published in: IEEE Transactions on Cognitive and Developmental Systems, Vol. 12, No. 3, pp. 474-485
Main Authors: Chen, Yu-Jen; Jiang, Wei-Cheng; Ju, Ming-Yi; Hwang, Kao-Shing
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.09.2020

Summary: Q-learning is a generic approach that uses a finite, discrete state and action domain to estimate action values with tabular or function-approximation methods. An intelligent agent eventually learns policies from continuous sensory inputs and encodes these environmental inputs onto a discrete state space. The application of Q-learning in a continuous state/action domain is the subject of many studies. This paper uses a tree structure to approximate a Q-function in a continuous state domain. The agent selects a discretized action with the maximum Q-value, and this discretized action is then extended to a continuous action using an action bias function. Reinforcement learning is difficult for a single agent when the state space is huge, so the proposed architecture is also applied to a multiagent system, wherein an individual agent transfers its useful Q-values to other agents to accelerate the learning process. Policies are shared between agents by grafting the branches of trees in which Q-values are stored onto other trees. Simulation results show that the proposed architecture performs better than tabular Q-learning and significantly accelerates the learning process because all agents use the sharing mechanism to cooperate with each other.
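The abstract describes three mechanisms: a tree that aggregates the continuous state space into discrete cells holding Q-values, an action bias function that extends the greedy discretized action into a continuous action, and policy sharing by grafting Q-value branches between agents' trees. The sketch below is a minimal, hypothetical illustration of those ideas, not the paper's implementation: it assumes a 1-D state interval, a binary tree pre-split to a fixed depth, and a leaf-level copy as a stand-in for grafting a whole branch. All names (Node, AggTreeQ, graft_branch) are invented for illustration.

```python
import random

class Node:
    """One cell of a binary aggregation tree over a 1-D continuous state."""
    def __init__(self, lo, hi, n_actions, depth=0):
        self.lo, self.hi = lo, hi                 # state interval covered by this cell
        self.q = [0.0] * n_actions                # Q-values for the discretized actions
        self.left = self.right = None
        if depth > 0:                             # pre-split to a fixed depth (simplification;
            mid = 0.5 * (lo + hi)                 # the paper grows the tree adaptively)
            self.left = Node(lo, mid, n_actions, depth - 1)
            self.right = Node(mid, hi, n_actions, depth - 1)

    def leaf_for(self, s):
        """Descend to the leaf whose interval contains state s."""
        node = self
        while node.left is not None:
            node = node.left if s < 0.5 * (node.lo + node.hi) else node.right
        return node

class AggTreeQ:
    """Q-learning over tree-aggregated states (illustrative sketch only)."""
    def __init__(self, lo, hi, actions, depth=4, alpha=0.1, gamma=0.95, eps=0.1):
        self.root = Node(lo, hi, len(actions), depth)
        self.actions = actions                    # discretized action set
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def act(self, s, bias_fn=None):
        """Epsilon-greedy discrete action, optionally extended to a continuous one."""
        leaf = self.root.leaf_for(s)
        if random.random() < self.eps:
            a = random.randrange(len(self.actions))
        else:
            a = max(range(len(self.actions)), key=lambda i: leaf.q[i])
        u = self.actions[a]
        if bias_fn is not None:                   # action bias: shift the discrete
            u += bias_fn(s, a)                    # action into the continuous range
        return a, u

    def update(self, s, a, r, s_next):
        """One tabular Q-learning backup on the leaf that covers s."""
        leaf, nxt = self.root.leaf_for(s), self.root.leaf_for(s_next)
        leaf.q[a] += self.alpha * (r + self.gamma * max(nxt.q) - leaf.q[a])

def graft_branch(src, dst, s):
    """Policy sharing: copy the source agent's Q-values for the region around s
    into the destination tree (a leaf-level stand-in for grafting a subtree)."""
    dst.root.leaf_for(s).q = list(src.root.leaf_for(s).q)
```

As a usage sketch, two AggTreeQ agents over the interval [-1, 1] with actions [-1.0, 0.0, 1.0] could call graft_branch(expert, novice, s) once the expert's estimates around s have converged, so the novice starts from shared Q-values instead of zeros, which is the kind of acceleration the abstract attributes to the sharing mechanism.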
ISSN: 2379-8920
eISSN: 2379-8939
DOI: 10.1109/TCDS.2019.2926477