State-Dependent Parameter Tuning of the Apparent Tardiness Cost Dispatching Rule Using Deep Reinforcement Learning
Published in | IEEE Access, Vol. 10, pp. 20187-20198 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2022 |
Summary: | The apparent tardiness cost (ATC) is a dispatching rule that demonstrates excellent performance in minimizing the total weighted tardiness (TWT) in single-machine scheduling. Its performance depends on the lookahead parameter in the equation that calculates the job priority index. Existing studies recommend either a fixed value or a value derived from a handcrafted function as an estimate of the lookahead parameter. However, such estimation relies on summarized job data, which inevitably loses information and yields an inferior schedule. This study proposes a reinforcement learning-based ATC dispatching rule that estimates the lookahead parameter directly from raw job data (processing time, weight, and slack time). Using a deep deterministic policy gradient (DDPG) algorithm, the scheduling agent learns the relationship between raw job data and the continuous lookahead parameter while interacting with the scheduling environment. We trained the DDPG model in simulation to minimize the TWT in a single-machine scheduling problem with unequal job arrival times. A preliminary experiment verified that the proposed dispatching rule, ATC-DDPG, successfully performed intelligent state-dependent parameter tuning. In the main experiment, which compared it with five existing dispatching rules, ATC-DDPG also achieved the best performance. |
---|---|
ISSN: | 2169-3536 |
DOI: | 10.1109/ACCESS.2022.3152192 |
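
The summary refers to a job priority index governed by a lookahead parameter but does not reproduce the equation. As a rough illustration only, the sketch below uses the commonly cited textbook form of the ATC index, I_j(t) = (w_j / p_j) · exp(−max(d_j − p_j − t, 0) / (k · p̄)), where k is the lookahead parameter and p̄ is the mean processing time of the waiting jobs. The `Job` class, the function names, and the sample data are hypothetical and are not taken from the paper; in particular, k is passed in by hand here rather than produced by a trained DDPG agent.

```python
import math
from dataclasses import dataclass


@dataclass
class Job:
    # Hypothetical job record; field names are illustrative assumptions.
    name: str
    processing_time: float
    weight: float
    due_date: float


def atc_priority(job: Job, now: float, k: float, mean_processing_time: float) -> float:
    """Textbook ATC priority index for one job at time `now`."""
    # Slack is the time left before the job must start to finish on time.
    slack = max(job.due_date - job.processing_time - now, 0.0)
    # Weighted-shortest-processing-time ratio, discounted by scaled slack.
    return (job.weight / job.processing_time) * math.exp(
        -slack / (k * mean_processing_time)
    )


def select_next_job(waiting: list[Job], now: float, k: float) -> Job:
    """Dispatch the waiting job with the highest ATC priority."""
    p_bar = sum(j.processing_time for j in waiting) / len(waiting)
    return max(waiting, key=lambda j: atc_priority(j, now, k, p_bar))


# Made-up sample data, chosen only to show sensitivity to k.
jobs = [
    Job("A", processing_time=4.0, weight=1.0, due_date=20.0),
    Job("B", processing_time=2.0, weight=1.0, due_date=6.0),
    Job("C", processing_time=3.0, weight=3.0, due_date=30.0),
]

urgent_first = select_next_job(jobs, now=0.0, k=0.5)
ratio_first = select_next_job(jobs, now=0.0, k=100.0)
print(urgent_first.name, ratio_first.name)
```

With these sample jobs, a small k (0.5) selects the job with the tightest slack, B, while a large k (100) makes the slack term nearly irrelevant and selects the job with the highest weight-to-processing-time ratio, C. This sensitivity of the dispatching decision to k is what makes the choice of lookahead parameter matter.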