Deep Reinforcement Learning for Optimal Replenishment in Stochastic Assembly Systems

Bibliographic Details
Published in Mathematics (Basel) Vol. 13; no. 14; p. 2229
Main Authors Sid Ahmed Abdellahi, Lativa; Zoubeir, Zeinebou; Mohamed, Yahya; Haouba, Ahmedou; Hmetty, Sidi
Format Journal Article
Language English
Published Basel: MDPI AG, 01.07.2025
ISSN 2227-7390
DOI 10.3390/math13142229

Abstract This study presents a reinforcement learning–based approach to optimize replenishment policies in the presence of uncertainty, with the objective of minimizing total costs, including inventory holding, shortage, and ordering costs. The focus is on single-level assembly systems, where both component delivery lead times and finished product demand are subject to randomness. The problem is formulated as a Markov decision process (MDP), in which an agent determines optimal order quantities for each component by accounting for stochastic lead times and demand variability. The Deep Q-Network (DQN) algorithm is adapted and employed to learn optimal replenishment policies over a fixed planning horizon. To enhance learning performance, we develop a tailored simulation environment that captures multi-component interactions, random lead times, and variable demand, along with a modular and realistic cost structure. The environment enables dynamic state transitions, lead time sampling, and flexible order reception modeling, providing a high-fidelity training ground for the agent. To further improve convergence and policy quality, we incorporate local search mechanisms and multiple action space discretizations per component. Simulation results show that the proposed method converges to stable ordering policies after approximately 100 episodes. The agent achieves an average service level of 96.93%, and stockout events are reduced by over 100% relative to early training phases. The system maintains component inventories within operationally feasible ranges, and cost components—holding, shortage, and ordering—are consistently minimized across 500 training episodes. These findings highlight the potential of deep reinforcement learning as a data-driven and adaptive approach to inventory management in complex and uncertain supply chains.
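
The environment described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class `AssemblyEnv`, the helper `run_episode`, the uniform lead-time and demand distributions, the lost-sales treatment of unmet demand, the one-unit-per-component bill of materials, and every cost value are assumptions chosen only to make the example self-contained. A fixed-quantity ordering rule stands in for the trained DQN agent.

```python
import random


class AssemblyEnv:
    """Toy single-level assembly system: one finished product, several components.

    Every numeric default below is an illustrative assumption, not a value
    taken from the study.
    """

    def __init__(self, n_components=2, horizon=20, max_lead_time=3, max_demand=5,
                 holding_cost=1.0, shortage_cost=10.0, ordering_cost=2.0, seed=0):
        self.n_components = n_components
        self.horizon = horizon              # fixed planning horizon, in periods
        self.max_lead_time = max_lead_time  # lead time ~ uniform on 1..max (assumption)
        self.max_demand = max_demand        # demand ~ uniform on 0..max (assumption)
        self.holding_cost = holding_cost    # per component unit in stock per period
        self.shortage_cost = shortage_cost  # per unit of unmet product demand
        self.ordering_cost = ordering_cost  # fixed cost per component order placed
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.t = 0
        self.inventory = [0] * self.n_components
        # pipeline[i] holds (arrival_period, quantity) pairs for component i
        self.pipeline = [[] for _ in range(self.n_components)]
        return self._state()

    def _state(self):
        on_order = [sum(q for _, q in orders) for orders in self.pipeline]
        return (self.t, tuple(self.inventory), tuple(on_order))

    def step(self, order_quantities):
        """Apply one order quantity per component; return (state, reward, done, info)."""
        cost = 0.0
        # 1. Place orders; each order gets its own sampled lead time.
        for i, q in enumerate(order_quantities):
            if q > 0:
                lead = self.rng.randint(1, self.max_lead_time)
                self.pipeline[i].append((self.t + lead, q))
                cost += self.ordering_cost
        # 2. Advance one period and receive every order that is now due.
        self.t += 1
        for i in range(self.n_components):
            arrived = sum(q for due, q in self.pipeline[i] if due <= self.t)
            self.pipeline[i] = [(due, q) for due, q in self.pipeline[i] if due > self.t]
            self.inventory[i] += arrived
        # 3. Demand is realized; assembly is capped by the scarcest component.
        demand = self.rng.randint(0, self.max_demand)
        assembled = min(demand, min(self.inventory))
        self.inventory = [x - assembled for x in self.inventory]
        shortage = demand - assembled       # unmet demand is lost (assumption)
        # 4. Holding and shortage costs; the reward is the negative total cost.
        cost += self.holding_cost * sum(self.inventory)
        cost += self.shortage_cost * shortage
        done = self.t >= self.horizon
        info = {"demand": demand, "served": assembled, "shortage": shortage}
        return self._state(), -cost, done, info


def run_episode(env, policy):
    """Roll out one episode and report total cost and service level."""
    state = env.reset()
    total_cost, demand, served, done = 0.0, 0, 0, False
    while not done:
        state, reward, done, info = env.step(policy(state))
        total_cost -= reward
        demand += info["demand"]
        served += info["served"]
    service_level = served / demand if demand else 1.0
    return {"total_cost": total_cost, "demand": demand,
            "served": served, "service_level": service_level}


if __name__ == "__main__":
    # Stand-in policy: order a fixed quantity of every component each period.
    print(run_episode(AssemblyEnv(seed=42), lambda state: [3, 3]))
```

In this sketch the service level is the fraction of finished-product demand that is assembled on time, which is one common definition and may differ from the measure reported in the study. A learning agent would replace the lambda with a policy that maps the state tuple to one discretized order quantity per component.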
Audience Academic
Author_xml – sequence: 1
  givenname: Lativa
  orcidid: 0009-0002-9128-0484
  surname: Sid Ahmed Abdellahi
  fullname: Sid Ahmed Abdellahi, Lativa
– sequence: 2
  givenname: Zeinebou
  surname: Zoubeir
  fullname: Zoubeir, Zeinebou
– sequence: 3
  givenname: Yahya
  orcidid: 0009-0003-4165-633X
  surname: Mohamed
  fullname: Mohamed, Yahya
– sequence: 4
  givenname: Ahmedou
  orcidid: 0009-0001-6228-768X
  surname: Haouba
  fullname: Haouba, Ahmedou
– sequence: 5
  givenname: Sidi
  orcidid: 0009-0006-1673-8853
  surname: Hmetty
  fullname: Hmetty, Sidi
ContentType Journal Article
Copyright 2025 MDPI AG
2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Discipline Mathematics
EISSN 2227-7390
GeographicLocations Germany
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 14
Language English
License https://creativecommons.org/licenses/by/4.0
OpenAccessLink https://doaj.org/article/cd816afe255f499195061702371709eb
PublicationDate 2025-07-01
PublicationDateYYYYMMDD 2025-07-01
PublicationDate_xml – month: 07
  year: 2025
  text: 2025-07-01
  day: 01
PublicationDecade 2020
PublicationPlace Basel
PublicationPlace_xml – name: Basel
PublicationTitle Mathematics (Basel)
PublicationYear 2025
Publisher MDPI AG
Publisher_xml – name: MDPI AG
StartPage 2229
SubjectTerms Algorithms
Artificial intelligence
assembly system
Convergence (Social sciences)
Costs
Customer satisfaction
Decision making
Deep learning
deep reinforcement learning
Demand
Design
Fines & penalties
Inventory
Inventory control
Inventory management
Lead time
Linear programming
Logistics
Manufacturing
Markov analysis
Markov processes
Modular structures
Neural networks
Optimization
Order quantity
Policies
Production planning
Random variables
Randomness
Raw materials
Replenishment
replenishment planning
Shortages
stochastic demand
Suppliers
Supply chains
uncertain lead times
Title Deep Reinforcement Learning for Optimal Replenishment in Stochastic Assembly Systems
URI https://www.proquest.com/docview/3233232081
https://doaj.org/article/cd816afe255f499195061702371709eb
Volume 13