Optimal Dynamic State‐Dependent Maintenance Policy by Deep Reinforcement Learning


Bibliographic Details
Published in: Quality and Reliability Engineering International, Vol. 41, No. 6, pp. 2715–2728
Main Authors: Eidi, Shaghayegh; Haghighi, Firoozeh; Safari, Abdollah; Zio, Enrico
Format: Journal Article
Language: English
Published: Bognor Regis: Wiley Subscription Services, Inc., 01.10.2025
Summary: In this paper, we propose a new maintenance strategy considering “do nothing”, “imperfect repair”, and “replace” as alternative actions on a deteriorating system. The system is subject to random shocks that accelerate degradation. Unlike most existing work on maintenance with imperfect repair actions, we propose a dynamic improvement factor that changes according to the state of the system at maintenance time. The proposed improvement factor is considered to have a random rejuvenating effect on the system, which reduces its degradation level (state) by reducing its age. Such a degradation state‐dependent improvement factor is more realistic than a fixed or random one, since the amount of improvement (rejuvenation) and the cost associated with maintenance are proportional to the system's needs as described by its degradation level. A Markov decision process with a continuous state space is formulated to model the maintenance problem, and a deep reinforcement learning algorithm is used to optimize the maintenance policy, with the decision maker trained by a Deep Q‐network. Central to this study is the comparison of three distinct models: a state‐independent improvement factor (Model I) versus two state‐dependent ones (Models II and III), with deterministic and stochastic repair effects, respectively. Through numerical and illustrative examples, this comparison underscores the importance of selecting the appropriate model when system condition data are available, demonstrating that state‐dependent models outperform their state‐independent counterparts in terms of cost‐efficiency and effectiveness. A sensitivity analysis is also conducted to examine the influence of the model's parameters on model selection.
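
To make the modeling setup concrete, the following is a minimal, illustrative sketch of such a maintenance environment in Python. The abstract does not specify the paper's degradation process, shock mechanism, cost structure, or the functional form of the improvement factor, so every parameter, distribution, and function below is an assumption for illustration (a gamma degradation process with Poisson-arriving shock jumps is assumed), not the authors' actual model.

```python
import numpy as np

# Illustrative parameters only -- the paper's actual degradation model,
# shock mechanism, and cost structure are not given in the abstract.
GAMMA_SHAPE, GAMMA_SCALE = 0.5, 1.0   # gamma-process degradation increments
SHOCK_RATE = 0.1                      # Poisson rate of random shocks per period
SHOCK_MEAN_JUMP = 2.0                 # mean extra degradation per shock
FAILURE_LEVEL = 20.0                  # degradation threshold for failure
C_REPAIR_BASE, C_REPLACE, C_FAILURE = 2.0, 10.0, 25.0

DO_NOTHING, IMPERFECT_REPAIR, REPLACE = 0, 1, 2  # the three actions

class MaintenanceEnv:
    """Continuous-state deteriorating system with three maintenance actions."""

    def __init__(self, seed=None):
        self.rng = np.random.default_rng(seed)
        self.state = 0.0  # current degradation level

    def improvement_factor(self, x):
        # State-dependent rejuvenation: the worse the degradation, the larger
        # (and costlier) the improvement. In the paper the factor rejuvenates
        # the system by reducing its age; here that is simplified to a direct
        # reduction of the degradation level. A stochastic variant (Model III
        # in the paper) would instead draw the factor from a distribution
        # whose mean depends on x, e.g. a Beta distribution.
        return min(1.0, x / FAILURE_LEVEL)

    def step(self, action):
        cost = 0.0
        if action == IMPERFECT_REPAIR:
            rho = self.improvement_factor(self.state)
            cost += C_REPAIR_BASE * rho       # repair cost scales with improvement
            self.state *= 1.0 - rho           # partial rejuvenation
        elif action == REPLACE:
            cost += C_REPLACE
            self.state = 0.0                  # good-as-new replacement
        # One period of degradation: gamma increment plus shock-induced jumps
        # (a simple stand-in for shocks that accelerate degradation).
        self.state += self.rng.gamma(GAMMA_SHAPE, GAMMA_SCALE)
        for _ in range(self.rng.poisson(SHOCK_RATE)):
            self.state += self.rng.exponential(SHOCK_MEAN_JUMP)
        failed = self.state >= FAILURE_LEVEL
        if failed:
            cost += C_FAILURE
            self.state = 0.0                  # corrective replacement on failure
        # DQN convention: observation = degradation level, reward = -cost.
        return np.array([self.state], dtype=np.float32), -cost, failed
```

A Deep Q-network agent would then be trained on transitions (state, action, −cost, next state) sampled from such an environment, learning which of the three actions minimizes the expected discounted maintenance cost at each degradation level.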
ISSN: 0748-8017
EISSN: 1099-1638
DOI: 10.1002/qre.3806