Weapon-Target Assignment Strategy in Joint Combat Decision-Making Based on Multi-Head Deep Reinforcement Learning

Bibliographic Details
Published in	IEEE Access, Vol. 11, pp. 113740-113751
Main Authors	Li, Shuai; He, Xiaoyuan; Xu, Xiao; Zhao, Tan; Song, Chenye; Li, Jiabao
Format	Journal Article
Language	English
Published	Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2023
Subjects	Algorithms; Assignment problem; Computational modeling; Constraint modelling; Decision making; Deep learning; Deep reinforcement learning; Discrete wavelet transforms; Efficiency; Heuristic algorithms; Machine learning; Markov processes; Mission planning; Object detection; Operations research; Optimization; Optimization models; Reinforcement learning; Resource management; Strategic planning; Target tracking; Weapon target allocation; Weapons

Summary:	To address the modeling difficulties and low search efficiency of traditional weapon-target assignment (WTA) algorithms, this paper proposes an intelligent WTA method based on deep reinforcement learning. Training yields an intelligent assignment model with strong decision-making capabilities (RL4WTA). First, a multi-constraint WTA optimization model is established, which discretizes the dynamic WTA problem into a static WTA problem. Next, a planning and solving environment for the WTA problem is designed, and a Markov Decision Process (MDP) for WTA tasks is constructed from the planning and solving model, providing a foundation for solving the WTA problem with reinforcement learning algorithms. Finally, a reinforcement learning-based WTA-solving model is proposed. A multi-head Q-value network decouples the complex joint decision space, improving the efficiency of the WTA model. A masking mechanism infers the valid actions that satisfy the constraints in the current situation, reducing uncertainty during reinforcement learning training. Experimental results show that RL4WTA adaptively generates satisfactory solutions in both small-scale and large-scale scenarios. Compared with traditional optimization algorithms, the model is superior in adaptability and computational efficiency, meeting the requirements of making optimal decisions for weapon-target assignment problems.
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2023.3324193
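
The summary mentions two mechanisms: a multi-head Q-value network that decouples the joint decision space, and a mask that restricts choices to valid actions. The sketch below illustrates how these ideas could combine at decision time. It is not the RL4WTA implementation. It assumes, purely for illustration, a two-head split (one head scores weapons, one scores targets for the chosen weapon) and two hypothetical constraints (remaining ammunition and range). All names, Q-values, and constraints are invented for the example.

```python
def masked_argmax(q_values, mask):
    """Return the index of the largest Q-value among entries whose mask is True."""
    valid = [i for i, ok in enumerate(mask) if ok]
    if not valid:
        raise ValueError("no valid action under the current constraints")
    return max(valid, key=lambda i: q_values[i])


def select_assignment(q_weapon, q_target, ammo, in_range):
    """Greedy two-head selection: pick a weapon first, then a target for it.

    q_weapon[i]    : hypothetical weapon-head Q-value for weapon i
    q_target[i][j] : hypothetical target-head Q-value for target j given weapon i
    ammo[i]        : remaining rounds of weapon i (assumed constraint)
    in_range[i][j] : True if weapon i can reach target j (assumed constraint)
    """
    # A weapon is valid only if it has ammunition and at least one reachable target.
    weapon_mask = [ammo[i] > 0 and any(in_range[i]) for i in range(len(ammo))]
    w = masked_argmax(q_weapon, weapon_mask)
    # The target head is masked by the range constraint of the chosen weapon.
    t = masked_argmax(q_target[w], in_range[w])
    return w, t


# Toy scenario with 3 weapons and 2 targets (all numbers are made up).
q_weapon = [0.9, 0.5, 0.7]
q_target = [[0.2, 0.8], [0.6, 0.1], [0.3, 0.4]]
ammo = [0, 2, 1]
in_range = [[True, True], [False, True], [True, False]]

print(select_assignment(q_weapon, q_target, ammo, in_range))  # prints (2, 0)
```

Weapon 0 has the highest raw Q-value but is masked out because it has no ammunition, so weapon 2 is chosen. Its higher-scoring target 1 is out of range, so target 0 is selected. With W weapons and T targets, the two heads output W + T values per step, where a single joint head would output W × T, which is one way to read the decoupling described in the summary.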