A Smart Flight Controller based on Reinforcement Learning for Unmanned Aerial Vehicle (UAV)
Traditional flight controllers consist of Proportional Integral Derivates (PID), that although have dominant stability control but required high human interventions. In this study, a smart flight controller is developed for controlling UAVs which produces operator less mechanisms for flight controll...
Saved in:
Published in | 2021 IEEE International Conference on Signal and Image Processing Applications (ICSIPA) pp. 203 - 208 |
---|---|
Main Authors | , , , , , |
Format | Conference Proceeding |
Language | English |
Published |
IEEE
13.09.2021
|
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | Traditional flight controllers consist of Proportional Integral Derivates (PID), that although have dominant stability control but required high human interventions. In this study, a smart flight controller is developed for controlling UAVs which produces operator less mechanisms for flight controllers. It uses a neural network that has been trained using reinforcement learning techniques. Engineered with a variety of actuators (pitch, yaw, roll, and speed), the next-generation flight controller is directly trained to control its own decisions in flight. It also optimizes learning algorithms different from the traditional Actor and Critic networks. The agent gets state information from the environment and calculates the reward function depending on the sensors data from the environment. The agent then receives the observations to identify the state and reward functions and the agent activates the algorithm to perform actions. It shows the performance of a trained neural network consisting of a reward function in both simulation and real-time UAV control. Experimental results show that it can respond with relative precision. Using the same framework shows that UAVs can reliably hover in the air, even under adverse initialization conditions with obstacles. Reward functions computed during the flight for 2500, 5000, 7500 and 10000 episodes between the normalized values 0 and −4000. The computation time observed during each episode is 15 micro sec. |
---|---|
ISSN: | 2642-6471 |
DOI: | 10.1109/ICSIPA52582.2021.9576806 |