UAV Pursuit-Evasion Game based on M2GPI algorithm
Published in | 2024 36th Chinese Control and Decision Conference (CCDC) pp. 795 - 800 |
---|---|
Main Authors | , , , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE 25.05.2024 |
Summary: | Unmanned aerial vehicle (UAV) technology has been a research hotspot in recent years. UAVs have become more intelligent, more widely used in military applications, and more difficult to defend against. As a typical differential game in air combat, the one-to-one pursuit-evasion game of UAVs has attracted wide attention. To solve this game, we combine the Minimax Q algorithm with a deep neural network and generalized policy iteration, and propose M2GPI, an improved Mini-Max Q network learning algorithm based on Generalized Policy Iteration and a fitted Q function. Building on the classic Minimax Q algorithm, M2GPI makes two contributions: (1) a neural network is introduced to fit the Q function, replacing the Q table of Minimax Q, so that the algorithm can be applied to large-scale problems; (2) generalized policy iteration is introduced to solve for the Nash equilibrium of both agents at each time step, which improves the update efficiency of the algorithm. M2GPI obtains an effective policy by replacing the optimal solution with the game-theoretic equilibrium solution, which not only improves convergence efficiency but also makes the policy reasonable. Experimental results show that M2GPI outperforms Minimax Q in both convergence speed and task success rate, supporting the rationality and superiority of the algorithm. |
---|---|
ISSN: | 1948-9447 |
DOI: | 10.1109/CCDC62350.2024.10587795 |
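
For readers unfamiliar with the baseline named in the summary, the following is a minimal sketch of the classic tabular Minimax Q update, not the M2GPI algorithm itself. It assumes a two-player zero-sum game in which `q[state][a][o]` holds the value of the agent's action `a` against the opponent's action `o`. The function names, the dictionary layout, and the use of fictitious play to approximate the equilibrium of each stage game are illustrative assumptions; the record does not say how the paper solves the stage game, and linear programming is the more usual choice.

```python
def solve_matrix_game(payoff, iterations=20000):
    """Approximate the row player's maximin mixed policy for a zero-sum
    matrix game using fictitious play (an assumption made for this sketch).

    payoff[i][j] is the reward to the row player (maximizer) when the row
    player picks i and the column player (minimizer) picks j.
    Returns (policy, value), where value is the payoff the returned
    policy guarantees against every column.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_counts = [0] * n_rows
    # Cumulative payoff of each row against the column player's history.
    row_sums = [0.0] * n_rows
    # Cumulative payoff of each column against the row player's history.
    col_sums = [0.0] * n_cols
    row_action, col_action = 0, 0
    for _ in range(iterations):
        row_counts[row_action] += 1
        for i in range(n_rows):
            row_sums[i] += payoff[i][col_action]
        for j in range(n_cols):
            col_sums[j] += payoff[row_action][j]
        # Each player best-responds to the other's empirical play so far.
        row_action = max(range(n_rows), key=lambda i: row_sums[i])
        col_action = min(range(n_cols), key=lambda j: col_sums[j])
    policy = [count / iterations for count in row_counts]
    # Worst-case expected payoff of the empirical row policy.
    value = min(col_sums) / iterations
    return policy, value


def minimax_q_update(q, state, a, o, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Minimax Q step:
    Q(s,a,o) <- Q(s,a,o) + alpha * (r + gamma * V(s') - Q(s,a,o)),
    where V(s') is the equilibrium value of the stage game Q(s', ., .).
    """
    _, next_value = solve_matrix_game(q[next_state])
    q[state][a][o] += alpha * (reward + gamma * next_value - q[state][a][o])
    return q[state][a][o]
```

In this sketch the Q table is a plain dictionary of payoff matrices. According to the summary, M2GPI replaces such a table with a neural network that fits the Q function and uses generalized policy iteration to obtain the equilibrium at each step; neither of those parts is reproduced here.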