The actor-critic algorithm as multi-time-scale stochastic approximation

The actor-critic algorithm of Barto and others for simulation-based optimization of Markov decision processes is cast as a two-time-scale stochastic approximation. Convergence analysis, approximation issues, and an example are studied.

Bibliographic Details
Published in: Sadhana (Bangalore) Vol. 22; no. 4; pp. 525-543
Main Authors: Borkar, Vivek S.; Konda, Vijaymohan R.
Format: Journal Article
Language: English
Published: Dordrecht, Springer Nature B.V., 01.08.1997
ISSN: 0256-2499, 0973-7677
DOI: 10.1007/BF02745577