Multi-Agent Cooperative Search and Target Tracking Based On Hierarchical Constrained Policy


Bibliographic Details
Published in 2024 4th Asia Conference on Information Engineering (ACIE), pp. 149-156
Main Authors Zhang, Shiyu, Feng, Min, Zhang, Qi, Zhang, Jingyi, Pu, Shi, Wei, Miaomiao
Format Conference Proceeding
Language English
Published IEEE 26.01.2024

Summary: Current cooperative multi-agent algorithms take the sum of team rewards as the joint objective. These algorithms assume consistent monotonicity between individual value functions and the team value function. However, monotonicity constraints cannot accurately fit autonomous-search scenarios where (1) agents influence each other during action execution and (2) individual rewards conflict with team rewards. Hence, we propose a method with hierarchical policy constraints. Locally, to avoid path conflicts caused by agent policies, we use a weighted sum of an agent's individual reward and the rewards of neighboring agents that may mutually influence it as that agent's reward function. Globally, to prevent individual agent policies from deviating from the team's expectations, we add a constraint on this disparity to the objective function for individual policy updates. Experimental results in a simulation of cooperative UAV search show that our method outperforms the baseline methods, greatly improving the task success rate and region coverage rate.
DOI:10.1109/ACIE61839.2024.00032
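The local reward shaping described in the summary (a weighted sum of an agent's own reward and its neighbors' rewards) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the neighborhood radius, the weight `w`, and all function names are assumptions for the example.

```python
# Hypothetical sketch of neighbor-weighted reward shaping:
# each agent's training reward mixes its own reward with the
# mean reward of nearby agents that may influence its path.
# Radius, weight, and names are illustrative assumptions.

def neighbors(positions, i, radius):
    """Indices of agents within `radius` of agent i (excluding i)."""
    xi, yi = positions[i]
    return [j for j, (xj, yj) in enumerate(positions)
            if j != i and (xj - xi) ** 2 + (yj - yi) ** 2 <= radius ** 2]

def shaped_reward(rewards, positions, i, radius=2.0, w=0.5):
    """Weighted sum of agent i's reward and its neighbors' mean reward.

    With no neighbors in range, the agent keeps its individual reward,
    so shaping only acts where mutual influence is possible.
    """
    nbrs = neighbors(positions, i, radius)
    if not nbrs:
        return rewards[i]
    neighbor_mean = sum(rewards[j] for j in nbrs) / len(nbrs)
    return (1 - w) * rewards[i] + w * neighbor_mean
```

For example, with agents at (0, 0), (1, 0), and (10, 10) and rewards [1.0, 3.0, 5.0], agent 0's shaped reward blends its own reward with agent 1's, while the distant agent 2 is unaffected.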