Weapon-Target Assignment Strategy in Joint Combat Decision-Making Based on Multi-Head Deep Reinforcement Learning


Bibliographic Details
Published in: IEEE Access, Vol. 11, pp. 113740-113751
Main Authors: Li, Shuai; He, Xiaoyuan; Xu, Xiao; Zhao, Tan; Song, Chenye; Li, Jiabao
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2023
Summary: In response to the modeling difficulties and low search efficiency of traditional weapon-target assignment algorithms, this paper proposes a deep reinforcement learning-based intelligent weapon-target assignment method. A weapon-target intelligent assignment model with strong decision-making capabilities (RL4WTA) is obtained by training. Firstly, a multi-constraint weapon-target assignment optimization model is established to discretize the dynamic weapon-target assignment problem into a static weapon-target assignment problem. Furthermore, a planning and solving environment for the weapon-target assignment (WTA) problem is designed, and a Markov Decision Process (MDP) for WTA tasks is constructed based on the planning and solving model. This provides a foundation for solving the WTA problem using reinforcement learning algorithms. Additionally, a reinforcement learning-based WTA-solving model is proposed in this paper. By utilizing a multi-head Q-value network, the complex joint decision space is decoupled, thereby improving the efficiency of the WTA model. The use of a masking mechanism allows for inferring valid actions that satisfy the constraint conditions under the current situation, reducing uncertainty during the reinforcement learning training process. Experimental results show that the proposed model, RL4WTA, can generate satisfactory solutions adaptively in both small-scale and large-scale scenarios. Compared with traditional optimization algorithms, the model is superior in adaptability and computational efficiency, meeting the requirements of making optimal decisions for weapon-target assignment problems.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2023.3324193
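
Illustrative note: the summary mentions two mechanisms, a multi-head Q-value network that decouples the joint decision space and a mask that restricts selection to valid actions. The Python sketch below shows one plausible reading of that idea and is not the authors' implementation. It assumes two independent heads (one scoring weapons, one scoring targets) and two simple constraints (remaining ammunition, target still alive). The function names, the constraints, and the Q-values are all hypothetical.

```python
import numpy as np


def masked_argmax(q_values, valid_mask):
    """Return the index of the highest Q-value among valid actions."""
    q = np.asarray(q_values, dtype=float)
    mask = np.asarray(valid_mask, dtype=bool)
    if not mask.any():
        raise ValueError("no valid action available")
    # Invalid actions are set to -inf so argmax can never select them.
    return int(np.argmax(np.where(mask, q, -np.inf)))


def select_assignment(weapon_q, target_q, ammo_left, target_alive):
    """Pick a (weapon, target) pair from two separate Q-value heads.

    Assumption: the heads are treated as independent, so W weapons and
    T targets need W + T outputs rather than W * T joint outputs.
    """
    weapon = masked_argmax(weapon_q, np.asarray(ammo_left) > 0)
    target = masked_argmax(target_q, target_alive)
    return weapon, target


# Hypothetical Q-values for 3 weapons and 4 targets.
weapon_q = [0.9, 0.4, 0.7]
target_q = [0.2, 0.8, 0.5, 0.6]
ammo_left = [0, 2, 1]                     # weapon 0 has no ammunition
target_alive = [True, False, True, True]  # target 1 is already destroyed

pair = select_assignment(weapon_q, target_q, ammo_left, target_alive)
print(pair)  # (2, 3)
```

In this toy case the highest raw Q-values belong to weapon 0 and target 1, but both are masked out, so the greedy choice falls to weapon 2 and target 3. With 3 weapons and 4 targets, the two heads produce 7 outputs instead of the 12 needed for a single joint head.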