Dynamic replenishment policy for perishable goods using change point detection-based soft actor-critic reinforcement learning

Bibliographic Details
Published in Expert systems with applications Vol. 270; p. 126556
Main Authors Kou, Aiqing; Cheng, Yan; Huang, Xiangyu; Jin, Jing
Format Journal Article
Language English
Published Elsevier Ltd 25.04.2025
Summary: This paper examines the problem of establishing a dynamic replenishment policy that minimizes the costs associated with selling perishable goods. Perishable inventory should match realized demand as closely as possible. However, demand is strongly non-stationary: its stochastic distribution pattern changes over time. The replenishment problem is modeled as a non-stationary Markov decision process (NSMDP) with unknown transition probabilities, and a deep reinforcement learning (DRL)-based solution framework is proposed for the NSMDP model. In this framework, a feature-enhanced long short-term memory (LSTM) network is employed to detect change points in real time. On this basis, the paper develops a change point detection-based soft actor-critic (CPD-SAC) algorithm that dynamically adjusts replenishment decisions to adapt to different states across various stochastic demand distribution patterns. The numerical experiments first analyze the effect of sliding window selection on the accuracy of change point detection (CPD). The proposed approach is then compared against several benchmark DRL algorithms and the static base stock policy. Finally, a sensitivity analysis is conducted on key parameters, including lead time, lifetime, and unit shortage cost for perishable goods. The results confirm the effectiveness of the proposed approach and indicate the scenarios in which the dynamic replenishment policy is applicable.
ISSN: 0957-4174
DOI: 10.1016/j.eswa.2025.126556
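
The static base stock benchmark named in the summary can be illustrated with a minimal sketch. This is not the authors' model. The function names, the cost parameters, the first-in-first-out issuing rule, the order of events within a period, and the Gaussian demand with a single mean shift are all assumptions chosen for illustration; the summary only states that lead time, lifetime, and unit shortage cost are key parameters.

```python
import random
from collections import deque


def shifting_demand(n=200, change_at=100, means=(20, 40), sd=5, seed=0):
    """Hypothetical non-stationary demand: the mean jumps once at `change_at`."""
    rng = random.Random(seed)
    return [
        max(0, round(rng.gauss(means[0] if t < change_at else means[1], sd)))
        for t in range(n)
    ]


def simulate_base_stock(base_stock, demands, lead_time=2, lifetime=3,
                        c_order=1.0, c_hold=0.1, c_short=5.0, c_waste=2.0):
    """Total cost of a static base stock policy on one demand path.

    All cost coefficients are illustrative assumptions.
    """
    # on_hand[i] holds units with i + 1 periods of life left (index 0 expires first).
    on_hand = [0] * lifetime
    # Orders in transit, oldest first; an order arrives `lead_time` periods later.
    pipeline = deque([0] * lead_time)
    total = 0.0
    for d in demands:
        # 1. Order up to the base stock level (on hand plus in transit).
        position = sum(on_hand) + sum(pipeline)
        q = max(0, base_stock - position)
        pipeline.append(q)
        # 2. Receive the oldest outstanding order as fresh stock.
        on_hand[-1] += pipeline.popleft()
        # 3. Serve demand from the oldest stock first (FIFO issuing).
        remaining = d
        for i in range(lifetime):
            used = min(on_hand[i], remaining)
            on_hand[i] -= used
            remaining -= used
        # 4. Unsold units in their last period expire; the rest age by one period.
        expired = on_hand[0]
        on_hand = on_hand[1:] + [0]
        # 5. Period cost: ordering, lost sales, waste, and holding.
        total += (c_order * q + c_short * remaining
                  + c_waste * expired + c_hold * sum(on_hand))
    return total


if __name__ == "__main__":
    path = shifting_demand()
    for level in (40, 80, 120, 160):
        print(level, round(simulate_base_stock(level, path), 1))
```

Running the script compares a few fixed order-up-to levels on one demand path whose mean shifts halfway through. A single static level has to serve both demand regimes, which is the limitation that a policy adapting to detected change points is meant to address.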