Model-Free Attitude Control of Spacecraft Based on PID-Guide TD3 Algorithm
Published in | International Journal of Aerospace Engineering Vol. 2020; no. 2020; pp. 1 - 13 |
---|---|
Main Authors | , , , , |
Format | Journal Article |
Language | English |
Published | Cairo, Egypt; Hindawi Publishing Corporation; 30.12.2020; Hindawi John Wiley & Sons, Inc Hindawi Limited |
Subjects | |
Summary: | This paper addresses model-free attitude control of rigid spacecraft subject to control torque saturation and external disturbances. A model-free deep reinforcement learning (DRL) controller is proposed that learns continuously from environment feedback and achieves high-precision attitude control without repeated tuning of the controller parameters. Because the state and action spaces are continuous, the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm, which is based on the actor-critic architecture, is adopted; it performs better than the Deep Deterministic Policy Gradient (DDPG) algorithm. TD3 obtains the optimal policy by interacting with the environment without any prior knowledge, so the learning process is time-consuming. To address this problem, the PID-Guide TD3 algorithm is proposed, which speeds up training and improves the convergence precision of TD3. To address the difficulty of deploying reinforcement learning (RL) in a real environment, a pretraining/fine-tuning method is proposed for deployment, which saves training time and computing resources and achieves good results quickly. The experimental results show that the DRL controller achieves high-precision attitude stabilization and attitude tracking control with fast response and small overshoot, and that the proposed PID-Guide TD3 algorithm trains faster and is more stable than the TD3 algorithm. |
---|---|
ISSN: | 1687-5966 1687-5974 |
DOI: | 10.1155/2020/8874619 |
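
The summary names TD3 without showing what distinguishes it from DDPG. As a minimal sketch of the standard TD3 critic target (clipped double-Q learning with target policy smoothing), the function below may help. It is not the authors' implementation and does not include the PID-Guide mechanism, whose details are not given in this record. All names (`td3_target`, `target_actor`, `target_q1`, `target_q2`, `max_action`) and the default hyperparameters are assumptions for illustration; `max_action` stands in for a normalized torque saturation limit.

```python
import random


def clip(x, lo, hi):
    """Clamp a scalar x to the interval [lo, hi]."""
    return max(lo, min(hi, x))


def td3_target(reward, next_state, done, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, max_action=1.0,
               rng=random):
    """Compute the TD3 critic target for a single transition.

    target_actor(state) -> list of action components (hypothetical callable)
    target_q1/target_q2(state, action) -> scalar Q estimate (hypothetical)
    """
    # Target policy smoothing: add clipped Gaussian noise to the target
    # action, then clamp to the actuator limit (assumed symmetric).
    next_action = [
        clip(a + clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip),
             -max_action, max_action)
        for a in target_actor(next_state)
    ]
    # Clipped double-Q: use the smaller of the two target critics to
    # reduce overestimation of the action value.
    q_min = min(target_q1(next_state, next_action),
                target_q2(next_state, next_action))
    # No bootstrapping from terminal states.
    not_done = 0.0 if done else 1.0
    return reward + gamma * not_done * q_min
```

In a full TD3 training loop, both critics would regress toward this target, and the actor and target networks would be updated less often than the critics (the "delayed" part of the name).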