Model-Free Attitude Control of Spacecraft Based on PID-Guide TD3 Algorithm

Bibliographic Details
Published in International Journal of Aerospace Engineering Vol. 2020; no. 2020; pp. 1-13
Main Authors Man, WanXin, An, Jiping, Li, Xinhong, Zhang, Zhibin, Zhang, GuoHui
Format Journal Article
Language English
Published Cairo, Egypt Hindawi Publishing Corporation 30.12.2020
Hindawi
John Wiley & Sons, Inc
Hindawi Limited
Summary: This paper addresses model-free attitude control of rigid spacecraft in the presence of control torque saturation and external disturbances. Specifically, a model-free deep reinforcement learning (DRL) controller is proposed that learns continuously from environment feedback and achieves high-precision attitude control of spacecraft without repeated tuning of the controller parameters. Because the state and action spaces are continuous, the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm, which is based on an actor-critic architecture, is adopted. Compared with the Deep Deterministic Policy Gradient (DDPG) algorithm, TD3 has better performance. However, TD3 obtains the optimal policy by interacting with the environment without using any prior knowledge, so the learning process is time-consuming. To address this problem, the PID-Guide TD3 algorithm is proposed, which speeds up training and improves the convergence precision of the TD3 algorithm. To address the difficulty of deploying reinforcement learning (RL) in a real environment, a pretraining/fine-tuning method is proposed for deployment, which saves training time and computing resources and achieves good results quickly. The experimental results show that the DRL controller achieves high-precision attitude stabilization and attitude tracking control, with fast response and small overshoot. The proposed PID-Guide TD3 algorithm trains faster and is more stable than the TD3 algorithm.
ISSN:1687-5966
1687-5974
DOI:10.1155/2020/8874619
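
The summary contrasts TD3 with DDPG without saying what differs. As a rough illustration only, the sketch below shows the two ingredients of the standard TD3 critic target: target policy smoothing and the clipped double-Q minimum. It is not the paper's implementation, and the summary gives no detail on how the PID controller guides training, so that part is not sketched. The function name `td3_target`, the callables `q1_next` and `q2_next`, and all default values (including `max_torque` as a stand-in for the torque saturation limit) are assumptions made for this example.

```python
import random


def td3_target(reward, done, next_action, q1_next, q2_next,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               max_torque=1.0, rng=None):
    """Return (smoothed_action, target_value) for one transition.

    q1_next and q2_next are hypothetical callables standing in for the two
    target critics evaluated at the next state; next_action is the target
    actor's output (one torque command per axis).
    """
    rng = rng or random.Random()
    smoothed = []
    for a in next_action:
        # Target policy smoothing: add clipped Gaussian noise to each command.
        noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
        # Keep the perturbed command inside the assumed saturation limit.
        smoothed.append(max(-max_torque, min(max_torque, a + noise)))
    # Clipped double-Q: take the smaller of the two critic estimates.
    q_min = min(q1_next(smoothed), q2_next(smoothed))
    # Bootstrapped target; no bootstrapping on terminal transitions.
    target = reward + gamma * (1.0 - float(done)) * q_min
    return smoothed, target


# Toy usage with constant critics (placeholder values, not paper data).
action, y = td3_target(
    reward=-0.1, done=False, next_action=[0.9, -0.2, 0.0],
    q1_next=lambda a: 2.0, q2_next=lambda a: 1.5,
    rng=random.Random(0),
)
```

In this toy call the target is built from the lower critic estimate (1.5), which is how TD3 counters the value overestimation that a single-critic DDPG target is prone to.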