The actor-critic algorithm as multi-time-scale stochastic approximation

The actor-critic algorithm of Barto and others for simulation-based optimization of Markov decision processes is cast as a two-time-scale stochastic approximation. Convergence analysis, approximation issues and an example are studied.

Bibliographic Details
Published in	Sadhana (Bangalore), Vol. 22, no. 4, pp. 525-543
Main Authors	Borkar, Vivek S; Konda, Vijaymohan R
Format	Journal Article
Language	English
Published	Dordrecht: Springer Nature B.V., 01.08.1997
Subjects	Algorithms; Approximation; Markov processes; Mathematical analysis; Optimization

More Information
ISSN:	0256-2499, 0973-7677
DOI:	10.1007/BF02745577