Perception-Aware Based UAV Trajectory Planner via Generative Adversarial Self-Imitation Learning From Demonstrations
Published in | IEEE internet of things journal p. 1 |
---|---|
Main Authors | , , , , |
Format | Journal Article |
Language | English |
Published | IEEE 08.10.2024 |
Subjects | |
Summary: | Unmanned aerial vehicles (UAVs) for Internet of Things applications such as intelligent monitoring and search are an increasingly popular research focus. Existing optimization algorithms for planning UAV flight paths frequently sacrifice path quality to reduce planning time. To address this trade-off, a perception-aware UAV trajectory planner based on generative adversarial self-imitation learning from demonstrations is proposed. First, a progressively growing discriminator is devised so that the policy network is not overpowered in the early stages of training, which avoids potential training failures. Second, the homogeneous strategy patterns of the optimized expert trajectories are addressed by adding successful trajectories from the policy network to the expert buffer, which improves the network's generalization. Third, a class-level instance-balancing expert buffer is introduced to counter the skewed distribution of, and considerable performance variation among, the strategies the policy network learns during training. Finally, the UAV's yaw angle is obtained in real time during flight from the position trajectory output by the policy network, using the analytical solution relating the position trajectory and the yaw angle. Experiments confirm that the proposed method achieves flight costs and success rates comparable to those of the reference expert method while reducing planning time. The method also adapts well to dynamic environments and obstacle trajectories not seen during training. Ablation studies show the individual contribution of each component of the proposed method. |
---|---|
ISSN: | 2327-4662 |
DOI: | 10.1109/JIOT.2024.3477450 |
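
The summary mentions two buffer ideas: adding successful policy trajectories to the expert buffer, and balancing the buffer at the class level. The record gives no implementation details, so the following is only a minimal sketch of one way such a buffer might look. The class name `ClassBalancedExpertBuffer`, the per-class FIFO capacity, the externally supplied `class_id` labels, and the uniform-over-classes sampling rule are all assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import defaultdict, deque


class ClassBalancedExpertBuffer:
    """Hypothetical expert buffer; not the paper's implementation."""

    def __init__(self, per_class_capacity=100, seed=0):
        # Assumption: one bounded FIFO queue per strategy class, so no
        # single class can crowd out the others.
        self._by_class = defaultdict(lambda: deque(maxlen=per_class_capacity))
        self._rng = random.Random(seed)

    def add(self, trajectory, class_id, success=True):
        # Self-imitation step: only successful trajectories are stored.
        # How success and class_id are determined is left to the caller.
        if success:
            self._by_class[class_id].append(trajectory)
        return success

    def sample(self, batch_size):
        # Pick a class uniformly, then a trajectory uniformly within it,
        # so rare classes are drawn as often as common ones.
        classes = [c for c, q in self._by_class.items() if q]
        if not classes:
            raise ValueError("buffer is empty")
        batch = []
        for _ in range(batch_size):
            c = self._rng.choice(classes)
            batch.append(self._rng.choice(self._by_class[c]))
        return batch

    def class_sizes(self):
        # Number of stored trajectories per class.
        return {c: len(q) for c, q in self._by_class.items()}
```

In this sketch, a class holding one trajectory is sampled about as often as a class holding a hundred, which is one simple reading of "class-level instance balancing"; the published method may weight classes differently.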