Optimal Dynamic State‐Dependent Maintenance Policy by Deep Reinforcement Learning
Published in | Quality and Reliability Engineering International, Vol. 41, No. 6, pp. 2715–2728 |
---|---|
Main Authors | Eidi, Shaghayegh; Haghighi, Firoozeh; Safari, Abdollah; Zio, Enrico |
Format | Journal Article |
Language | English |
Published | Bognor Regis: Wiley Subscription Services, Inc., 01.10.2025 |
Subjects | Deep learning; Degradation; Machine learning; Maintenance; Markov processes; Optimization; Repair; Sensitivity analysis |
Online Access | https://www.proquest.com/docview/3245213744 |
ISSN | 0748-8017 |
EISSN | 1099-1638 |
DOI | 10.1002/qre.3806 |
Abstract | In this paper, we propose a new maintenance strategy considering “do nothing”, “imperfect repair”, and “replace” as alternative actions on a deteriorating system. The system is subject to random shocks that accelerate degradation. Unlike most existing works on maintenance with imperfect repair actions, we propose a dynamic improvement factor that changes according to the state of the system at maintenance time. The proposed improvement factor is considered to have a random rejuvenating effect on the system, which reduces its degradation level (state) by reducing age. Such a degradation state‐dependent improvement factor is more realistic than a fixed or random one, since the amount of improvement (rejuvenation) and the cost associated with maintenance are proportional to the system's needs as described by its degradation level. A Markov decision process is formulated to model the maintenance problem with a continuous state space, and a deep reinforcement learning algorithm is used to optimize the maintenance policy, with the decision maker trained by a Deep Q‐network. Central to this study is the comparison of three distinct models: a state‐independent improvement factor (Model I) versus two state‐dependent ones (Models II and III), with deterministic and stochastic repair effects, respectively. Through numerical and illustrative examples, this comparison underscores the importance of selecting the appropriate model when system condition data are available, demonstrating that state‐dependent models outperform their state‐independent counterparts in terms of cost‐efficiency and effectiveness. A sensitivity analysis is also conducted to examine the influence of the model's parameters on model selection. |
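Since the record describes the method only at the abstract level, the following is a minimal sketch of how such a three-action maintenance MDP and its Deep Q-network controller could be wired together: a continuous state (degradation level, virtual age), gamma-process wear accelerated by Poisson shocks, and a degradation-state-dependent improvement factor that rejuvenates the system under imperfect repair. All cost figures, the improvement-factor form, and the simplified training loop (no replay buffer or target network) are illustrative assumptions, not the paper's calibrated model.

```python
# Hedged sketch of a state-dependent-improvement maintenance MDP with a DQN.
# Every numerical parameter below is an illustrative assumption.
import numpy as np
import torch
import torch.nn as nn

DO_NOTHING, IMPERFECT_REPAIR, REPLACE = 0, 1, 2

class MaintenanceEnv:
    """Degrading system: gamma-process wear plus Poisson random shocks."""
    def __init__(self, failure_level=10.0, shock_rate=0.1, shock_size=1.0):
        self.failure_level = failure_level
        self.shock_rate = shock_rate
        self.shock_size = shock_size
        self.reset()

    def reset(self):
        self.x = 0.0    # degradation level
        self.age = 0.0  # virtual age
        return np.array([self.x, self.age], dtype=np.float32)

    def step(self, action):
        cost = 0.0
        if action == REPLACE:
            cost += 50.0
            self.x, self.age = 0.0, 0.0
        elif action == IMPERFECT_REPAIR:
            # State-dependent improvement factor: rejuvenation (and cost)
            # grow with the current degradation level (a Model-II-style
            # deterministic repair effect, assumed here for illustration).
            rho = min(0.9, 0.1 + 0.08 * self.x)
            cost += 5.0 + 2.0 * rho * self.x
            self.x *= (1.0 - rho)
            self.age *= (1.0 - rho)
        # One period of wear: gamma increment plus random shock arrivals.
        self.x += np.random.gamma(shape=1.0 + 0.1 * self.age, scale=0.3)
        self.x += self.shock_size * np.random.poisson(self.shock_rate)
        self.age += 1.0
        failed = self.x >= self.failure_level
        if failed:
            cost += 100.0  # corrective replacement on failure (renewal)
            self.x, self.age = 0.0, 0.0
        return np.array([self.x, self.age], dtype=np.float32), -cost, failed

# Q-network over the continuous state, one output per maintenance action.
q_net = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 3))
opt = torch.optim.Adam(q_net.parameters(), lr=1e-3)
env, gamma, eps = MaintenanceEnv(), 0.95, 0.1

state = env.reset()
for step in range(20_000):
    # Epsilon-greedy action selection.
    if np.random.rand() < eps:
        action = np.random.randint(3)
    else:
        with torch.no_grad():
            action = int(q_net(torch.from_numpy(state)).argmax())
    next_state, reward, _ = env.step(action)
    # One-step TD target (no replay buffer/target net, to stay minimal).
    with torch.no_grad():
        target = reward + gamma * q_net(torch.from_numpy(next_state)).max()
    pred = q_net(torch.from_numpy(state))[action]
    loss = (pred - target) ** 2
    opt.zero_grad(); loss.backward(); opt.step()
    state = next_state
```

Failures act as costly renewals rather than terminal states, so the loop trains on a single infinite-horizon trajectory; a faithful reproduction of the paper would add the components elided here (experience replay, a target network, and the paper's own cost and degradation parameters).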
Author details |
– Shaghayegh Eidi, School of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran
– Firoozeh Haghighi (ORCID 0000-0003-1880-937X), School of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran
– Abdollah Safari, School of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran
– Enrico Zio, Center for Research on Risks and Crises (CRC), Mines Paris‐PSL University, Paris, France; Energy Department, Politecnico di Milano, Milan, Italy
Copyright | 2025 John Wiley & Sons Ltd. |
Discipline | Engineering |
Peer Reviewed | Yes |