Reinforcement learning for online optimization of job-shop scheduling in a smart manufacturing factory

Bibliographic Details
Published in Advances in Mechanical Engineering, Vol. 14, no. 3
Main Authors Zhou, Tong; Zhu, Haihua; Tang, Dunbing; Liu, Changchun; Cai, Qixiang; Shi, Wei; Gui, Yong
Format Journal Article
Language English
Published London, England: SAGE Publications, 01.03.2022
Sage Publications Ltd
SAGE Publishing
Summary: The job-shop scheduling problem (JSSP) is a complex combinatorial problem, especially in dynamic environments. Low-volume, high-mix orders carry varied design specifications that introduce many uncertainties into manufacturing systems. Traditional scheduling methods are limited in handling diverse manufacturing resources in a dynamic environment. In recent years, artificial intelligence (AI) has attracted the interest of researchers working on dynamic scheduling problems. However, it is difficult to optimize scheduling policies for online decision making while considering multiple objectives. This paper therefore proposes a smart scheduler to handle real-time jobs and unexpected events in smart manufacturing factories. New composite reward functions are formulated to improve the decision-making ability and learning efficiency of the smart scheduler. Based on deep reinforcement learning (RL), the smart scheduler autonomously learns to schedule manufacturing resources in real time and dynamically improves its decision-making ability. The proposed scheduling model is evaluated and validated through a series of experiments on a smart factory testbed. Experimental results show that the smart scheduler not only achieves good learning and scheduling performance by optimizing the composite reward functions, but also copes with unexpected events (e.g., urgent or simultaneous orders, machine failures) and balances efficiency against profit.
ISSN: 1687-8132
1687-8140
DOI: 10.1177/16878132221086120