Simultaneous Learning and Planning in a Hierarchical Control System for a Cognitive Agent

Bibliographic Details
Published in Automation and Remote Control Vol. 83; no. 6; pp. 869-883
Main Author Panov, A. I.
Format Journal Article
Language English
Published Moscow: Pleiades Publishing, 01.06.2022
Springer Nature B.V.
Summary: Behavior planning and learning to make decisions in a dynamic environment are usually treated as separate tasks in control systems for intelligent agents. A new unified hierarchical formulation of the problem of simultaneous learning and planning (SLAP) is proposed in the context of object-oriented reinforcement learning, and an architecture of a cognitive agent that solves this problem is described. A new algorithm for learning actions in a partially observable external environment is proposed; it uses a reward signal, an object-oriented subject description of the states of the external environment, and dynamically updated action plans. The main properties and advantages of the proposed algorithm are considered. These include the absence of a fixed cognitive cycle, which in earlier algorithms forced planning and learning into separate subsystems, and the ability to construct and update a model of interaction with the environment, which increases learning efficiency. A theoretical justification of some provisions of this approach is given, a model example is proposed, and the operating principle of a SLAP agent driving an unmanned vehicle is demonstrated.
ISSN: 0005-1179, 1608-3032
DOI: 10.1134/S0005117922060054