A dynamic multi-task selective execution policy considering stochastic dependence between degradation and random shocks by deep reinforcement learning
Published in: Reliability Engineering & System Safety, Vol. 257, p. 110844
Main Authors:
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.05.2025
ISSN: 0951-8320
DOI: 10.1016/j.ress.2025.110844
Summary:
• A dynamic multi-task selective execution policy for multi-task missions.
• The policy considers the stochastic dependence between degradation and shocks.
• The dynamic decision-making model is constructed as a Markov decision process.
• A deep reinforcement learning approach with an action mask is tailored.
• The action mask technique is used to prevent the repeated selection of tasks.
To improve the efficiency of unmanned aerial vehicles (UAVs), it is common for a UAV to perform multiple tasks on each departure. However, existing mission abort policies primarily focus on scenarios where the system executes a single task and are not suitable for more complex multi-task missions. Moreover, in practice, degradation and random shocks often occur simultaneously, while existing studies typically consider only their separate effects on mission abort policies. To address these problems, a multi-task selective execution policy that accounts for the stochastic dependence between degradation and shocks is proposed to determine the next task for the system or the timing of mission abort. First, considering the health state of the system, its location, and the completion state of tasks, a multi-task selective execution policy is proposed. Next, to maximize the cumulative reward of the system, the corresponding sequential decision problem is formulated as a Markov decision process. Then, to address the curse of dimensionality of the continuous state space, a solution method based on deep reinforcement learning is tailored, incorporating an action-masking technique to avoid repeated selection of already executed tasks. Finally, the effectiveness of the proposed method is verified through a numerical study of a UAV performing multiple reconnaissance tasks.
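The record gives no implementation details, but the action-masking idea named in the abstract and highlights can be illustrated. Below is a minimal PyTorch sketch, assuming a DQN-style agent whose state combines system health, location, and a binary task-completion vector; the class name, dimensions, and network architecture are illustrative assumptions, not the paper's actual design.

import torch
import torch.nn as nn

class MaskedQNetwork(nn.Module):
    """Q-network whose outputs for already-executed tasks are masked out.

    Assumed state: system health (continuous), location, and binary
    task-completion flags. Assumed actions: select one of n_tasks to
    execute next, or abort the mission.
    """
    def __init__(self, state_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state: torch.Tensor, action_mask: torch.Tensor) -> torch.Tensor:
        q = self.net(state)
        # Set Q-values of invalid actions (already-executed tasks) to -inf,
        # so neither greedy selection nor the bootstrapped max can pick them.
        return q.masked_fill(~action_mask, float("-inf"))

# Example: 3 tasks plus an abort action; tasks 0 and 2 are already done.
n_tasks = 3
state = torch.tensor([[0.7, 0.2, 0.5, 1.0, 0.0, 1.0]])  # health, location, completion flags
mask = torch.tensor([[False, True, False, True]])        # only task 1 and abort remain valid
net = MaskedQNetwork(state_dim=state.shape[1], n_actions=n_tasks + 1)
action = net(state, mask).argmax(dim=1)  # greedy action restricted to valid choices

Masking Q-values to -inf (rather than penalizing invalid actions through the reward) keeps already-executed tasks out of both action selection and the bootstrapped targets, which is the effect the highlights describe as preventing repeated task selection.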