Model-Free Attitude Control of Spacecraft Based on PID-Guide TD3 Algorithm

Bibliographic Details
Published in	International Journal of Aerospace Engineering Vol. 2020; no. 2020; pp. 1 - 13
Main Authors	Man, WanXin, An, Jiping, Li, Xinhong, Zhang, Zhibin, Zhang, GuoHui
Format	Journal Article
Language	English
Published	Cairo, Egypt: Hindawi Publishing Corporation, 30.12.2020; Hindawi; John Wiley & Sons, Inc; Hindawi Limited
Subjects	Aerospace engineering; Algorithms; Attitude stability; Comparative analysis; Control algorithms; Control theory; Controllers; Design; Machine learning; Neural networks; Noise; Space ships; Space vehicles; Spacecraft; Spacecraft attitude control; Spacecraft-environment interaction; Tracking control; Training; Velocity
Summary:	This paper addresses model-free attitude control of a rigid spacecraft subject to control torque saturation and external disturbances. A model-free deep reinforcement learning (DRL) controller is proposed that learns continuously from environment feedback and achieves high-precision attitude control without repeated tuning of the controller parameters. Because the state and action spaces are continuous, the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm, which is built on the actor-critic architecture, is adopted; it performs better than the Deep Deterministic Policy Gradient (DDPG) algorithm. TD3 obtains the optimal policy by interacting with the environment without any prior knowledge, so learning is time-consuming. To address this, the PID-Guide TD3 algorithm is proposed, which speeds up training and improves the convergence precision of TD3. Because reinforcement learning (RL) is difficult to deploy in a real environment, a pretraining/fine-tuning method is proposed for deployment, which saves training time and computing resources and reaches good results quickly. The experimental results show that the DRL controller achieves high-precision attitude stabilization and attitude tracking control with fast response and small overshoot, and that the proposed PID-Guide TD3 algorithm trains faster and more stably than the TD3 algorithm.
ISSN:	1687-5966 1687-5974
DOI:	10.1155/2020/8874619
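
Note on TD3: the summary names TD3 as the base algorithm but the record does not reproduce its update rule. As a rough illustration only, the sketch below computes the critic target as TD3 is generally described: the target action is perturbed with clipped noise (target policy smoothing) and the smaller of two target-critic estimates is used (clipped double Q-learning). The function name `td3_target`, the callable critics, and the default hyperparameters are assumptions made for this example and are not taken from the paper; the PID-guide mechanism is not shown because the record gives no detail on it.

```python
import random


def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def td3_target(reward, done, next_action, q1_next, q2_next,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_limit=1.0, rng=random):
    """Critic target for one transition (illustrative sketch).

    next_action: list of floats, the target actor's output for the next state.
    q1_next, q2_next: hypothetical callables standing in for the two target
        critics evaluated at the next state; each maps an action list to a float.
    All default hyperparameter values are assumptions, not values from the paper.
    """
    smoothed = []
    for a in next_action:
        # Target policy smoothing: add clipped Gaussian noise, then
        # clamp to the actuator range (here a symmetric torque limit).
        eps = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
        smoothed.append(clip(a + eps, -action_limit, action_limit))
    # Clipped double Q-learning: take the more pessimistic estimate.
    q_min = min(q1_next(smoothed), q2_next(smoothed))
    # No bootstrapping past a terminal state.
    return reward + gamma * (0.0 if done else 1.0) * q_min


if __name__ == "__main__":
    # Toy critics for demonstration only.
    q1 = lambda a: sum(a)
    q2 = lambda a: sum(a) + 1.0
    print(td3_target(1.0, False, [0.5, -2.0], q1, q2, gamma=0.9, noise_std=0.0))
```

Clamping the smoothed action to `action_limit` is the point at which a torque saturation bound, such as the one the summary mentions, would enter this calculation.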