Batch process control based on reinforcement learning with segmented prioritized experience replay

Bibliographic Details
Published in Measurement science & technology Vol. 35; no. 5; p. 56202
Main Authors Xu, Chen, Ma, Junwei, Tao, Hongfeng
Format Journal Article
Language English
Published 01.05.2024
Abstract Batch processes are difficult to control accurately because of their complex nonlinear dynamics and unstable operating conditions. Traditional methods such as model predictive control suffer seriously degraded control performance when the process model is inaccurate. In contrast, reinforcement learning (RL) provides a viable alternative by interacting directly with the environment to learn an optimal strategy. This paper proposes a batch process controller based on soft actor-critic (SAC) with segmented prioritized experience replay (SPER). SAC combines off-policy updates and maximum entropy RL with an actor-critic formulation, which can yield a more robust control strategy than other RL methods. To improve the efficiency of the experience replay mechanism in tasks with long episodes and multiple phases, a new experience sampling method called SPER is designed for SAC. In addition, a novel reward function is designed for the SPER-SAC based controller to deal with sparse rewards. Finally, the effectiveness of the SPER-SAC based controller is demonstrated on batch process examples by comparison with conventional RL-based control methods.
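The abstract does not say how SPER partitions or samples experience, so the following is only an illustrative sketch of one way a segmented prioritized replay buffer could look. It assumes that an episode is split into equal-length time segments, that each segment keeps its own buffer, and that transitions are drawn within a segment with probability proportional to priority raised to an exponent alpha, as in standard prioritized experience replay. The class name, method names and all parameters are hypothetical and are not taken from the paper.

```python
import random


class SegmentedPrioritizedReplay:
    """Hypothetical sketch: one prioritized buffer per episode segment.

    Assumption (not from the paper): segments are equal-length slices of
    the episode, and every non-empty segment contributes the same number
    of samples to a batch.
    """

    def __init__(self, num_segments, episode_length, capacity_per_segment,
                 alpha=0.6, seed=0):
        self.num_segments = num_segments
        self.episode_length = episode_length
        self.capacity = capacity_per_segment
        self.alpha = alpha  # 0 gives uniform sampling, 1 fully proportional
        self.rng = random.Random(seed)
        # Each segment is a list of [priority, transition] pairs.
        self.segments = [[] for _ in range(num_segments)]

    def segment_of(self, t):
        """Map time step t in [0, episode_length) to a segment index."""
        idx = t * self.num_segments // self.episode_length
        return min(idx, self.num_segments - 1)

    def add(self, t, transition, priority=1.0):
        """Store a transition observed at time step t."""
        seg = self.segments[self.segment_of(t)]
        if len(seg) >= self.capacity:
            seg.pop(0)  # evict the oldest transition in this segment
        seg.append([priority, transition])

    def sample(self, per_segment):
        """Draw per_segment transitions (with replacement) from each
        non-empty segment, weighted by priority ** alpha."""
        batch = []
        for seg_idx, seg in enumerate(self.segments):
            if not seg:
                continue
            weights = [p ** self.alpha for p, _ in seg]
            positions = self.rng.choices(range(len(seg)), weights=weights,
                                         k=per_segment)
            batch.extend((seg_idx, pos, seg[pos][1]) for pos in positions)
        return batch

    def update_priority(self, seg_idx, pos, td_error, eps=1e-6):
        """Set the priority from the absolute TD error. Positions are only
        valid until the next add() that evicts from this segment."""
        self.segments[seg_idx][pos][0] = abs(td_error) + eps


# Usage: a 30-step episode split into 3 segments.
buf = SegmentedPrioritizedReplay(num_segments=3, episode_length=30,
                                 capacity_per_segment=100)
for t in range(30):
    buf.add(t, ("transition", t))
batch = buf.sample(per_segment=4)
print(len(batch))  # 3 non-empty segments x 4 samples = 12
```

Sampling the same number of transitions from every segment is one simple way to keep late or short phases of a long episode represented in each update. Whether the paper balances segments this way is not stated in the abstract.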
Author Xu, Chen (ORCID 0000-0002-5399-5297)
Ma, Junwei
Tao, Hongfeng (ORCID 0000-0001-5279-2458)
ContentType Journal Article
DOI 10.1088/1361-6501/ad21cf
Discipline Sciences (General)
Physics
EISSN 1361-6501
ISSN 0957-0233
IsPeerReviewed true
IsScholarly true
Issue 5
Language English
ORCID 0000-0001-5279-2458
0000-0002-5399-5297
PublicationDate 2024-05-01
PublicationTitle Measurement science & technology
PublicationYear 2024
StartPage 56202
Volume 35