Model-based offline reinforcement learning framework for optimizing tunnel boring machine operation

Bibliographic Details
Published in: Underground space (Beijing), Vol. 19, pp. 47-71
Main Authors: Cao, Yupeng; Luo, Wei; Xue, Yadong; Lin, Weiren; Zhang, Feng
Format: Journal Article
Language: English
Published: Shanghai, Elsevier B.V., 01.12.2024
KeAi Publishing Communications Ltd
KeAi Communications Co., Ltd

Summary: Research on the automation and intelligent operation of tunnel boring machines (TBMs) is receiving increasing attention, benefiting from the growing volume of construction data. However, most models for TBM operation optimization have been trained on labels derived from human drivers' decisions, which are subjective and stochastic. As a result, the control parameters suggested by these models can hardly surpass the performance of a human driver and may even reproduce subjectively incorrect decisions. Considering that the geomechanical feedback to the TBM under a driver's actions is objective, this paper proposes a transformer-based model, called the geological response for tunnel boring machine (GRTBM), to learn the relationship between operational adjustments and changes in TBM monitoring data. Additionally, using model-based offline reinforcement learning, the paper provides a novel approach to optimizing TBM excavation operations. The decision processes recorded in the Yin-song TBM project, a waterway tunnel in Jilin Province, China, were used to validate the model. By adopting an implicit perception of geological conditions in the GRTBM model, the suggested method achieved the desired state within a single action, greatly outperforming the practical adjustments, which took 500 s, and indicating that the proposed model has the potential to surpass human capability.
ISSN: 2467-9674, 2096-2754
DOI: 10.1016/j.undsp.2024.01.008