Performance Analysis of Trial and Error Algorithms

Bibliographic Details
Published in	IEEE Transactions on Parallel and Distributed Systems, Vol. 31, no. 6, pp. 1343-1356
Main Authors	Gaveau, Jerome; Le Martret, Christophe J.; Assaad, Mohamad
Format	Journal Article
Language	English
Published	New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.06.2020
Subjects	Algorithms; Approximation algorithms; Computer Science; Convergence; Error analysis; game theory; Games; Machine learning; markov chain; Markov chains; Markov processes; Mathematical model; Model-free optimization; multi-agent system; Networking and Internet Architecture; Optimization; Performance measurement; Players; Resource management; trial and error; Utilities
Summary:	Model-free decentralized optimization and learning are receiving increasing attention from both theoretical and practical perspectives. In particular, two fully decentralized learning algorithms, Trial and Error Learning (TEL) and Optimal Dynamical Learning (ODL), are appealing for a broad class of games. ODL spends a high proportion of time in an optimum state that maximizes the sum of the utilities of all players. TEL does the same if a pure Nash equilibrium exists; otherwise, it spends a high proportion of time in a state that maximizes a trade-off between the sum of the players' utilities and a predefined stability function. However, estimating the mean fraction of time spent in the optimum state (as well as the mean time needed to reach it) is challenging because of the high complexity and dimension of the underlying Markov chains. In this article, under a specific system model, these performance metrics are evaluated by means of an approximation of the Markov chains considered, which overcomes the problem of high dimensionality. The two algorithms are then compared, giving a better understanding of their performance.
ISSN:	1045-9219, 1558-2183
DOI:	10.1109/TPDS.2020.2964256
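
Illustration:	The two performance metrics named in the Summary (the mean fraction of time spent in the optimum state and the mean time to reach it) correspond, for an ergodic Markov chain, to a stationary probability and an expected hitting time. The sketch below computes both for a small chain. It is not the TEL or ODL chain and not the approximation proposed in the article: the 3-state transition matrix P, the choice of state 2 as the "optimum", and the function names are all hypothetical, chosen only to show what the metrics mean.

```python
# Toy illustration only: P is an invented 3-state chain, not a TEL/ODL model.
# State 2 plays the role of the "optimum state" (an assumption for this sketch).

def stationary_distribution(P, iters=10_000, tol=1e-12):
    """Long-run fraction of time in each state, by iterating pi <- pi P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        new = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


def mean_hitting_times(P, target, iters=10_000, tol=1e-12):
    """Expected number of steps to first reach `target` from each state.

    Iterates the fixed point h[i] = 1 + sum_j P[i][j] * h[j], with h[target] = 0.
    """
    n = len(P)
    h = [0.0] * n
    for _ in range(iters):
        new = [
            0.0 if i == target
            else 1.0 + sum(P[i][j] * h[j] for j in range(n))
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new, h)) < tol:
            return new
        h = new
    return h


# Hypothetical transition matrix; each row sums to 1.
P = [
    [0.50, 0.40, 0.10],
    [0.30, 0.50, 0.20],
    [0.05, 0.05, 0.90],
]
OPTIMUM = 2

pi = stationary_distribution(P)
h = mean_hitting_times(P, OPTIMUM)

print(f"fraction of time in optimum state: {pi[OPTIMUM]:.4f}")
print(f"mean steps to reach it from state 0: {h[0]:.4f}")
```

For a chain this small both quantities can be checked by hand (here the stationary probability of state 2 is 26/43). The difficulty described in the Summary is that the state space of the real chains is far too large for such direct computation, which is what motivates an approximation.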