Deep Reinforcement Learning for Optimal Replenishment in Stochastic Assembly Systems

Bibliographic Details
Published in	Mathematics (Basel) Vol. 13; no. 14; p. 2229
Main Authors	Sid Ahmed Abdellahi, Lativa; Zoubeir, Zeinebou; Mohamed, Yahya; Haouba, Ahmedou; Hmetty, Sidi
Format	Journal Article
Language	English
Published	Basel: MDPI AG, 2025-07-01
Subjects	Algorithms; Artificial intelligence; assembly system; Convergence (Social sciences); Costs; Customer satisfaction; Decision making; Deep learning; deep reinforcement learning; Demand; Design; Fines & penalties; Inventory; Inventory control; Inventory management; Lead time; Linear programming; Logistics; Manufacturing; Markov analysis; Markov processes; Modular structures; Neural networks; Optimization; Order quantity; Policies; Production planning; Random variables; Randomness; Raw materials; Replenishment; replenishment planning; Shortages; stochastic demand; Suppliers; Supply chains; uncertain lead times; Germany
ISSN	2227-7390
DOI	10.3390/math13142229

Abstract	This study presents a reinforcement learning–based approach to optimize replenishment policies in the presence of uncertainty, with the objective of minimizing total costs, including inventory holding, shortage, and ordering costs. The focus is on single-level assembly systems, where both component delivery lead times and finished product demand are subject to randomness. The problem is formulated as a Markov decision process (MDP), in which an agent determines optimal order quantities for each component by accounting for stochastic lead times and demand variability. The Deep Q-Network (DQN) algorithm is adapted and employed to learn optimal replenishment policies over a fixed planning horizon. To enhance learning performance, we develop a tailored simulation environment that captures multi-component interactions, random lead times, and variable demand, along with a modular and realistic cost structure. The environment enables dynamic state transitions, lead time sampling, and flexible order reception modeling, providing a high-fidelity training ground for the agent. To further improve convergence and policy quality, we incorporate local search mechanisms and multiple action space discretizations per component. Simulation results show that the proposed method converges to stable ordering policies after approximately 100 episodes. The agent achieves an average service level of 96.93%, and stockout events are reduced by over 100% relative to early training phases. The system maintains component inventories within operationally feasible ranges, and cost components—holding, shortage, and ordering—are consistently minimized across 500 training episodes. These findings highlight the potential of deep reinforcement learning as a data-driven and adaptive approach to inventory management in complex and uncertain supply chains.
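Illustrative sketch (not from the article): the MDP described in the abstract can be pictured as a small simulation environment in which an agent orders components each period, deliveries arrive after random lead times, and the per-period cost is the sum of holding, shortage, and ordering costs. The Python code below is a minimal sketch under stated assumptions. The names AssemblyEnv and rollout, the uniform lead-time and demand distributions, the one-unit-per-component bill of materials, the lost-sales treatment of unmet demand, and all cost values are hypothetical choices made for illustration; the DQN agent, local search, and action-space discretization mentioned in the abstract are omitted.

```python
import random


class AssemblyEnv:
    """Toy single-level assembly system: one finished product, n components.

    Every parameter value here is an illustrative assumption, not a value
    taken from the article.
    """

    def __init__(self, n_components=2, horizon=20, max_lead_time=3,
                 max_demand=5, holding_cost=1.0, shortage_cost=10.0,
                 ordering_cost=2.0, seed=0):
        self.n_components = n_components
        self.horizon = horizon
        self.max_lead_time = max_lead_time  # lead time ~ uniform on 1..max
        self.max_demand = max_demand        # demand ~ uniform on 0..max
        self.holding_cost = holding_cost    # per component unit per period
        self.shortage_cost = shortage_cost  # per unit of unmet demand
        self.ordering_cost = ordering_cost  # fixed cost per order placed
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.t = 0
        self.stock = [0] * self.n_components
        # pipeline[i] holds (arrival_period, quantity) pairs for component i
        self.pipeline = [[] for _ in range(self.n_components)]
        return self._state()

    def _state(self):
        on_order = [sum(q for _, q in p) for p in self.pipeline]
        return (self.t, tuple(self.stock), tuple(on_order))

    def step(self, orders):
        """Advance one period; `orders` is one order quantity per component."""
        # 1. Place orders, each with its own sampled lead time.
        n_orders = 0
        for i, qty in enumerate(orders):
            if qty > 0:
                lead = self.rng.randint(1, self.max_lead_time)
                self.pipeline[i].append((self.t + lead, qty))
                n_orders += 1
        self.t += 1
        # 2. Receive every order whose arrival period has been reached.
        for i in range(self.n_components):
            arrived = sum(q for due, q in self.pipeline[i] if due <= self.t)
            self.pipeline[i] = [(d, q) for d, q in self.pipeline[i] if d > self.t]
            self.stock[i] += arrived
        # 3. Each finished unit consumes one unit of every component;
        #    unmet demand is lost (an assumption of this sketch).
        demand = self.rng.randint(0, self.max_demand)
        built = min(demand, min(self.stock))
        self.stock = [s - built for s in self.stock]
        shortage = demand - built
        # 4. Period cost = holding + shortage + ordering; reward is its negative.
        cost = (self.holding_cost * sum(self.stock)
                + self.shortage_cost * shortage
                + self.ordering_cost * n_orders)
        done = self.t >= self.horizon
        return self._state(), -cost, done, {"demand": demand, "shortage": shortage}


def rollout(env, policy):
    """Run one episode with `policy(state) -> orders`; return the total cost."""
    state, done, total_cost = env.reset(), False, 0.0
    while not done:
        state, reward, done, _ = env.step(policy(state))
        total_cost -= reward
    return total_cost
```

For example, rollout(AssemblyEnv(seed=1), lambda state: [2, 2]) evaluates a fixed policy that orders two units of each component every period. A learning agent such as DQN would instead pick the order vector from a discretized action set and be trained on the returned reward.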
Audience	Academic
Author_xml	1. Sid Ahmed Abdellahi, Lativa (ORCID 0009-0002-9128-0484); 2. Zoubeir, Zeinebou; 3. Mohamed, Yahya (ORCID 0009-0003-4165-633X); 4. Haouba, Ahmedou (ORCID 0009-0001-6228-768X); 5. Hmetty, Sidi (ORCID 0009-0006-1673-8853)
ContentType	Journal Article
Copyright	COPYRIGHT 2025 MDPI AG 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Discipline	Mathematics
EISSN	2227-7390
GeographicLocations	Germany
IsDoiOpenAccess	true
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
Issue	14
Language	English
License	https://creativecommons.org/licenses/by/4.0
ORCID	0009-0006-1673-8853 0009-0001-6228-768X 0009-0003-4165-633X 0009-0002-9128-0484
OpenAccessLink	https://doaj.org/article/cd816afe255f499195061702371709eb
PublicationDate	2025-07-01
PublicationPlace	Basel
PublicationTitle	Mathematics (Basel)
PublicationYear	2025
Publisher	MDPI AG
URI	https://www.proquest.com/docview/3233232081 https://doaj.org/article/cd816afe255f499195061702371709eb