Discrete-Time Local Value Iteration Adaptive Dynamic Programming: Convergence Analysis

In this paper, convergence properties are established for the newly developed discrete-time local value iteration adaptive dynamic programming (ADP) algorithm. The present local iterative ADP algorithm permits an arbitrary positive semidefinite function to initialize the algorithm. Employing a state...

Full description

Saved in:
Bibliographic Details
Published inIEEE transactions on systems, man, and cybernetics. Systems Vol. 48; no. 6; pp. 875 - 891
Main Authors Wei, Qinglai, Lewis, Frank L., Liu, Derong, Song, Ruizhuo, Lin, Hanquan
Format Journal Article
LanguageEnglish
Published New York IEEE 01.06.2018
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:In this paper, convergence properties are established for the newly developed discrete-time local value iteration adaptive dynamic programming (ADP) algorithm. The present local iterative ADP algorithm permits an arbitrary positive semidefinite function to initialize the algorithm. Employing a state-dependent learning rate function, for the first time, the iterative value function and iterative control law can be updated in a subset of the state space instead of the whole state space, which effectively relaxes the computational burden. A new analysis method for the convergence property is developed to prove that the iterative value functions will converge to the optimum under some mild constraints. Monotonicity of the local value iteration ADP algorithm is presented, which shows that under some special conditions of the initial value function and the learning rate function, the iterative value function can monotonically converge to the optimum. Finally, three simulation examples and comparisons are given to illustrate the performance of the developed algorithm.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:2168-2216
2168-2232
DOI:10.1109/TSMC.2016.2623766