The actor-critic algorithm as multi-time-scale stochastic approximation
The actor-critic algorithm of Barto and others for simulation-based optimization of Markov decision processes is cast as a two-time-scale stochastic approximation. Convergence analysis, approximation issues, and an example are studied.
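To illustrate what a two-time-scale scheme looks like in practice, the following is a minimal sketch and not the formulation analyzed in the article. It assumes a hypothetical two-state, two-action toy problem with a discounted reward, a softmax policy, and step-size exponents (0.6 for the critic, 0.9 for the actor) chosen only so that the actor's step size vanishes relative to the critic's. The function names `softmax` and `run_actor_critic` are illustrative.

```python
import math
import random


def softmax(prefs):
    """Turn a list of action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def run_actor_critic(num_steps=20000, gamma=0.9, seed=0):
    """Two-time-scale actor-critic on a toy MDP (all modelling choices are assumptions).

    Toy MDP: action 1 earns reward 1, action 0 earns reward 0, and the next
    state is drawn uniformly at random regardless of the action taken.
    """
    rng = random.Random(seed)
    n_states, n_actions = 2, 2
    values = [0.0] * n_states                              # critic: value estimates
    prefs = [[0.0] * n_actions for _ in range(n_states)]   # actor: action preferences
    state = 0
    for n in range(1, num_steps + 1):
        # Both step sizes decay, but actor_step / critic_step -> 0,
        # so the critic runs on the faster time scale.
        critic_step = (n + 100) ** -0.6
        actor_step = (n + 100) ** -0.9

        policy = softmax(prefs[state])
        action = 0 if rng.random() < policy[0] else 1
        reward = 1.0 if action == 1 else 0.0
        next_state = rng.randrange(n_states)

        # Temporal-difference error computed from the current critic.
        td_error = reward + gamma * values[next_state] - values[state]

        # Fast time scale: critic update.
        values[state] += critic_step * td_error

        # Slow time scale: actor update along the softmax score direction.
        for a in range(n_actions):
            score = (1.0 if a == action else 0.0) - policy[a]
            prefs[state][a] += actor_step * td_error * score

        state = next_state
    return values, [softmax(p) for p in prefs]


if __name__ == "__main__":
    values, policy = run_actor_critic()
    print("value estimates:", values)
    print("policy per state:", policy)
```

Because the actor moves slowly, the critic sees a nearly fixed policy at each step, while the actor sees a critic that has nearly settled. This separation is the intuition behind treating the two updates as coupled iterations on different time scales.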
Published in | Sadhana (Bangalore) Vol. 22; no. 4; pp. 525 - 543 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Dordrecht, Springer Nature B.V, 01.08.1997 |
ISSN: | 0256-2499 0973-7677 |
DOI: | 10.1007/BF02745577 |