Adaptation of Frequency Hopping Interval for Radar Anti-Jamming Based on Reinforcement Learning

Bibliographic Details
Published in	IEEE Transactions on Vehicular Technology, vol. 71, no. 12, pp. 12434-12449
Main Authors	Ailiya; Yi, Wei; Varshney, Pramod K.
Format	Journal Article
Language	English
Published	New York: IEEE, 01.12.2022; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Adaptation; Adaptation models; Decision making; Efficiency; Electronic warfare; Frequency agile radar; Frequency hopping; Interception; Jammers; Jamming; Learning; Markov processes; Radar; radar anti-jamming; Receivers; reinforcement learning; Signal to noise ratio; Spaceborne radar; Strategy; Time-frequency analysis

Summary:	In modern electronic warfare, intelligent and adaptive radar anti-jamming methods are increasingly important because jammers can now launch complex and unpredictable attacks. Moreover, in practice, the jamming strategy is usually unknown to the radar. To overcome the limitations caused by the lack of information about the jammer, this paper applies reinforcement learning (RL) to radar anti-jamming by adapting the frequency hopping interval. In reinforcement learning, the sequential decision problem to be solved is described as a Markov Decision Process (MDP), so a detailed radar anti-jamming MDP model is formulated to describe the sequential anti-jamming decision-making process. To balance integration efficiency against probability of interception, the reward function of the MDP is defined as the weighted sum of the integration efficiency factor and the probability of interception factor, which gives a flexible, adjustable tradeoff between the two. Two properties of the MDP value function are proved and then used to derive the optimal frequency hopping time interval for different pulse widths under the RL framework. Simulation results show that the proposed radar anti-jamming strategy adapts well to the jamming environment and that its performance can be controlled flexibly by adjusting the weights of integration efficiency and probability of interception.
ISSN:	0018-9545, 1939-9359
DOI:	10.1109/TVT.2022.3197425
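
Illustration:	The summary describes the MDP reward as a weighted sum of an integration efficiency factor and a probability of interception factor. The sketch below shows only that weighted-sum form. It is not the paper's implementation: the function name, the parameter names, the sample values, and the convention that both factors are already scaled so that larger is better for the radar are all assumptions made for illustration, since the record does not define the factors.

```python
def weighted_reward(efficiency_factor, interception_factor,
                    w_efficiency, w_interception):
    """Weighted-sum reward, as described in the summary.

    All names are hypothetical. Both factors are ASSUMED to be
    pre-scaled scores where a larger value is better for the radar;
    the catalog record does not give their actual definitions.
    """
    return w_efficiency * efficiency_factor + w_interception * interception_factor


# Two hypothetical hopping intervals with made-up factor values:
# a long interval (good integration, easier to intercept) and
# a short interval (poorer integration, harder to intercept).
candidates = {
    "long_interval": (0.9, 0.2),
    "short_interval": (0.4, 0.8),
}


def best_interval(w_efficiency, w_interception):
    """Return the candidate with the highest weighted reward."""
    return max(
        candidates,
        key=lambda name: weighted_reward(*candidates[name],
                                         w_efficiency, w_interception),
    )
```

With these made-up values, weighting efficiency heavily (for example 0.8 and 0.2) favors the long interval, while weighting interception heavily (0.2 and 0.8) favors the short one, which is the kind of adjustable tradeoff the summary refers to.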