A set of novel continuous action-set reinforcement learning automata models to optimize continuous functions

Bibliographic Details
Published in	Applied intelligence (Dordrecht, Netherlands) Vol. 46; no. 4; pp. 845 - 864
Main Authors	Guo, Ying; Ge, Hao; Li, Shenghong
Format	Journal Article
Language	English
Published	New York: Springer US, 01.06.2017; Springer Nature B.V.
Subjects	Accuracy Algorithms Artificial Intelligence Computer Science Continuity (mathematics) Learning Machines Manufacturing Mathematical models Mechanical Engineering Optimization Optimization algorithms Processes Random variables Reinforcement Searching Learning automata CALA Function optimization Artificial intelligence Reinforcement learning
ISSN	0924-669X 1573-7497
DOI	10.1007/s10489-016-0853-4

Summary:	Learning automata (LA), a powerful reinforcement learning tool within the field of artificial intelligence, can adaptively search for the optimal state in a random environment. Over the past decades, quite a few finite action-set learning automata (FALA) algorithms have matured, but they expose critical defects when applied to optimizing continuous functions. To overcome these shortcomings and explore a higher-performance LA, we propose a novel continuous action-set learning automata (CALA) algorithm that solves function optimization problems via one kind of LA prototype, i.e., the continuous action-set reinforcement learning automaton, abbreviated as CARLA. The key mechanism of the proposed algorithm is a combination of equidistant discretization and linear interpolation. Specifically, four categories of application models are constructed. Two of them are created to obtain continuous actions when the prior information is finite, thus avoiding the drawbacks of FALA; they rely on the cumulative distribution function (CDF) and on a new concept, the area surrounded by curves (AsbC), respectively. The other two models are modified versions that balance the trade-off between accuracy and speed. Moreover, these models are extended to generalized versions so that multidimensional function optimization problems can be handled as well. Extensive experiments, covering four benchmarks and three scenarios, are designed to demonstrate the effectiveness and efficiency of the proposed application models. The proposed algorithm outperforms state-of-the-art LA as well as other optimization algorithms, with a high accuracy rate, fast convergence, and competitive time consumption, especially in noisy environments.
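
Illustration:	The summary names equidistant discretization, linear interpolation, and the CDF as the ingredients for obtaining continuous actions, but this record does not give the algorithm itself. The sketch below is therefore not the authors' method; it is a generic inverse-CDF sampler, assuming a density stored at equally spaced grid points and linearly interpolated between them. The function name sample_action and its parameters are hypothetical, and no learning (probability update) rule is shown.

```python
import bisect
import math
import random


def sample_action(grid, density, rng=random):
    """Draw one continuous action from a piecewise-linear density.

    grid    -- increasing grid points (equidistant in the assumed setting)
    density -- non-negative density values at the grid points (unnormalized)
    rng     -- any object with a uniform(lo, hi) method (assumption)
    """
    # Trapezoid area under the interpolated density on each segment.
    areas = [0.5 * (density[i] + density[i + 1]) * (grid[i + 1] - grid[i])
             for i in range(len(grid) - 1)]

    # Cumulative areas give the (unnormalized) CDF at the grid points.
    cdf = [0.0]
    for area in areas:
        cdf.append(cdf[-1] + area)

    # Inverse-transform sampling: pick a CDF level, then locate its segment.
    u = rng.uniform(0.0, cdf[-1])
    k = min(max(bisect.bisect_right(cdf, u) - 1, 0), len(areas) - 1)

    # Inside segment k the CDF is quadratic: a*t + 0.5*slope*t^2 = r.
    width = grid[k + 1] - grid[k]
    r = u - cdf[k]
    a = density[k]
    slope = (density[k + 1] - a) / width
    disc = max(a * a + 2.0 * slope * r, 0.0)
    denom = a + math.sqrt(disc)
    # Numerically stable form of the root that lies in [0, width].
    t = 2.0 * r / denom if denom > 0.0 else 0.0
    return grid[k] + min(t, width)
```

Because the interpolated density is linear on each segment, the CDF is quadratic there and can be inverted in closed form, so the returned action can fall anywhere between grid points rather than only on them.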