Perception-Aware Based UAV Trajectory Planner via Generative Adversarial Self-Imitation Learning From Demonstrations

Bibliographic Details
Published in	IEEE Internet of Things Journal, p. 1
Main Authors	Zhang, Hanxuan; Huo, Ju; Huang, Yulong; Cheng, Jiajun; Li, Xiaofeng
Format	Journal Article
Language	English
Published	IEEE 08.10.2024
Subjects	Autonomous aerial vehicles; class-level instance-balancing; Costs; generative adversarial self-imitation learning from demonstration; Imitation learning; Internet of Things; perception-aware based trajectory planning; Planning; progressively growing discriminator; Real-time systems; Robustness; Training; Trajectory; Trajectory planning; Unmanned aerial vehicle
Summary:	The use of unmanned aerial vehicles (UAVs) for Internet of Things applications, such as intelligent monitoring and search, is an increasingly popular research focus worldwide. Various optimization algorithms exist for planning UAV flight paths, but they frequently sacrifice path quality to reduce planning time. To address this trade-off, a perception-aware UAV trajectory planner via generative adversarial self-imitation learning from demonstration is proposed. First, a progressively growing discriminator is devised to prevent the policy network from being overpowered in the early stages of training, which avoids potential training failures. Second, the homogenized strategic patterns among optimized expert trajectories are addressed by adding successful trajectories from the policy network to the expert buffer, which enhances the network's generalization capability. Third, a class-level instance-balancing expert buffer is introduced to counter the skewed distribution of, and considerable performance variation among, the strategies learned by the policy network during training. Finally, the UAV's yaw angle is obtained in real time during flight from the position trajectory output by the policy network, using the analytical solution relating the position trajectory to the yaw angle. Experiments confirm that the proposed method achieves flight costs and success rates comparable to those of the reference expert method while reducing planning time. The method also adapts well to dynamic environments and obstacle trajectories that were not involved in training. Additionally, ablation studies highlight the individual contribution of each component of the proposed method.
ISSN:	2327-4662
DOI:	10.1109/JIOT.2024.3477450