Scalable Scheduling of Semiconductor Packaging Facilities Using Deep Reinforcement Learning

Bibliographic Details
Published in	IEEE Transactions on Cybernetics, Vol. 53, no. 6, pp. 3518-3531
Main Authors	Park, In-Beom, Park, Jonghun
Format	Journal Article
Language	English
Published	United States: IEEE, 01.06.2023; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Aerospace electronics; Artificial neural networks; Deep deterministic policy gradient (DDPG); Deep learning; deep reinforcement learning (DRL); flexible job shop scheduling; Heuristic methods; Job shop scheduling; Metaheuristics; Packaging; Processor scheduling; Production; Representations; Scheduling; semiconductor manufacturing system; semiconductor packaging
Summary:	Reinforcement learning (RL) has emerged as a promising approach for scheduling semiconductor operations. Yet it is still challenging to solve large-scale scheduling problems with an RL method, since learning complexity grows quickly as the size of the shop floor increases. This challenge becomes more apparent when solving scheduling problems with a wide variety of job types, which leads to difficulties in exploration and function approximation in RL. This article presents a scheduling method for semiconductor packaging facilities using deep RL, in which an agent allocates a job to one of the machines in a centralized manner. Specifically, a novel state representation is introduced to effectively accommodate variations in the number of available machines and in the production requirements. Furthermore, we propose a continuous representation of an action so that the size of the action space stays the same even when the numbers of jobs, machines, and operation types change. Extensive experiments on large-scale datasets demonstrate that, in terms of makespan, the proposed method mostly outperforms the metaheuristics, the rule-based methods, and the other RL approaches considered, while requiring much less computation time than the metaheuristics.
ISSN:	2168-2267, 2168-2275
DOI:	10.1109/TCYB.2021.3128075
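
Illustration:	The summary mentions a continuous action representation whose size does not change with the numbers of jobs and machines. The record does not say how the article constructs or decodes that action, so the sketch below is only one generic way such an idea could work, not the authors' method: the agent emits a fixed-length vector, and the feasible job-machine pair whose feature vector lies closest to it is selected. The names `Candidate` and `decode_action`, and the two-element feature vectors, are hypothetical.

```python
import math
from typing import List, NamedTuple, Sequence


class Candidate(NamedTuple):
    """A feasible job-machine pair (hypothetical structure, not from the article)."""
    job: str
    machine: str
    features: Sequence[float]  # assumed: a fixed-length descriptor scaled to [0, 1]


def decode_action(action: Sequence[float], candidates: List[Candidate]) -> Candidate:
    """Map a fixed-length continuous action to the nearest feasible pair.

    The action length equals the feature length, so it does not depend on
    how many jobs or machines are currently on the shop floor.
    """
    if not candidates:
        raise ValueError("no feasible job-machine pair")
    # Euclidean nearest neighbour in feature space (an assumed decoding rule).
    return min(candidates, key=lambda c: math.dist(action, c.features))


# Made-up candidates; the feature values carry no real meaning.
candidates = [
    Candidate("J1", "M1", (0.2, 0.9)),
    Candidate("J2", "M1", (0.8, 0.1)),
    Candidate("J3", "M2", (0.5, 0.5)),
]
chosen = decode_action((0.75, 0.2), candidates)
print(chosen.job, chosen.machine)
```

Adding more candidates to the list changes only the search in `decode_action`; the action vector keeps the same length, which is the property the summary describes.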