Dynamic Coverage Meets Regret: Unifying Two Control Performance Measures for Mobile Agents in Spatiotemporally Varying Environments

Bibliographic Details
Published in	2021 60th IEEE Conference on Decision and Control (CDC), pp. 521-526
Main Authors	Haydon, Ben; Mishra, Kirti D.; Keyantuo, Patrick; Panagou, Dimitra; Chow, Fotini; Moura, Scott; Vermillion, Chris
Format	Conference Proceeding
Language	English
Published	IEEE 14.12.2021
Subjects	Conferences; Correlation; Heuristic algorithms; Machine learning; Mobile agents; Spatiotemporal phenomena; Wind energy
Summary:	Numerous mobile robotic applications require agents to persistently explore and exploit spatiotemporally varying, partially observable environments. Ultimately, the mathematical notion of regret, which quite simply represents the instantaneous or time-averaged difference between the optimal reward and realized reward, serves as a meaningful measure of how well the agents have exploited the environment. However, while numerous theoretical regret bounds have been derived within the machine learning community, restrictions on the manner in which the environment evolves preclude their application to persistent missions. On the other hand, meaningful theoretical properties can be derived for the related concept of dynamic coverage, which serves as an exploration measurement but does not have an immediately intuitive connection with regret. In this paper, we demonstrate a clear correlation between an appropriately defined measure of dynamic coverage and regret, then go on to derive performance bounds on dynamic coverage as a function of the environmental parameters. We evaluate the correlation for several variants of an airborne wind energy system, for which the objective is to adjust the operating altitude in order to maximize power output in a spatiotemporally evolving wind field.
ISSN:	2576-2370
DOI:	10.1109/CDC45484.2021.9682826