An Intelligent Bait Delivery Control Method for Flight Vehicle Evasion Based on Reinforcement Learning

Bibliographic Details
Published in Aerospace Vol. 11; no. 8; p. 653
Main Authors Xue, Shuai; Wang, Zhaolei; Bai, Hongyang; Yu, Chunmei; Deng, Tianyu; Sun, Ruisheng
Format Journal Article
Language English
Published Basel MDPI AG 01.08.2024
Summary: In aerial combat, infrared bait is an important means of penetration for an aircraft facing an infrared air-to-air missile strike, and an effective bait delivery strategy is critical. To address this issue, this study proposes an intelligent bait-dropping control method based on an improved deep deterministic policy gradient (DDPG) algorithm. First, the relative motion between the aircraft, the bait, and the incoming missile was modeled, and the Markov decision process of the aircraft-bait-missile infrared effect was constructed with visual distance and line-of-sight angle as states. Then, the DDPG algorithm was improved by means of pre-training and classification sampling. The infrared bait-dropping decision network was trained through interaction with the environment and iterative learning, which yielded the bait-dropping strategy. Finally, the corresponding environment was transferred to the Nvidia Jetson TX2 embedded platform for comparative testing. The simulation results showed that this method converged 46.3% faster than the traditional DDPG algorithm. More importantly, it generated an effective bait-dropping strategy that enabled the aircraft to evade the incoming missile. The strategy instruction generation time was only about 2.5 ms, which makes online decision-making feasible.
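The summary names distance and line-of-sight angle as the state variables of the Markov decision process. As a rough illustration only, the sketch below computes those two quantities from planar positions. The 2D geometry, the function name `los_state`, the choice of the missile as observer, and all coordinates are assumptions made for this example; the record does not give the paper's exact state definition.

```python
import math

def los_state(observer_xy, target_xy):
    """Return (distance, line-of-sight angle in radians) from observer to target.

    Hypothetical helper: planar geometry is assumed for illustration.
    """
    dx = target_xy[0] - observer_xy[0]
    dy = target_xy[1] - observer_xy[1]
    distance = math.hypot(dx, dy)   # Euclidean range
    los_angle = math.atan2(dy, dx)  # angle measured from the +x axis
    return distance, los_angle

# Illustrative positions in metres (not taken from the paper).
missile = (0.0, 0.0)
aircraft = (3000.0, 4000.0)
bait = (3000.0, 3900.0)

# One possible state vector: missile-to-aircraft and missile-to-bait geometry.
state = los_state(missile, aircraft) + los_state(missile, bait)
```

A vector of this kind could serve as the observation passed to a DDPG actor network; how the paper actually normalizes or extends the state is not stated in this record.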
ISSN: 2226-4310
DOI: 10.3390/aerospace11080653