Distributionally Robust Surrogate Optimal Control for High-Dimensional Systems

Bibliographic Details
Published in: IEEE Transactions on Control Systems Technology, Vol. 31, no. 3, pp. 1196-1207
Main Authors: Kandel, Aaron; Park, Saehong; Moura, Scott J.
Format: Journal Article
Language: English
Published: New York, IEEE, 1 May 2023
Publisher: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1063-6536 (print), 1558-0865 (electronic)
DOI: 10.1109/TCST.2022.3216988

Abstract: This article presents a novel methodology for tractably solving optimal control and offline reinforcement learning (RL) problems for high-dimensional systems. This work is motivated by the ongoing challenges of safety, computation, and optimality in high-dimensional optimal control. We address these key questions with the following approach. First, we identify a sequence-modeling surrogate methodology that takes as input the initial state and a time series of control inputs and outputs an approximation of the objective function and trajectories of constraint functions. Importantly, this approach entirely absorbs the individual state transition dynamics. The sole dependence on the initial state means we can apply dimensionality reduction to compress the model input while retaining most of its information. Uncertainty in the surrogate objective will affect the resulting optimality. Critically, however, uncertainty in the surrogate constraint functions will lead to infeasibility, i.e., unsafe actions. When considering offline RL, the most significant modeling errors will be encountered on out-of-distribution (OOD) data. Therefore, we apply Wasserstein ambiguity sets to "robustify" our surrogate modeling approach subject to worst-case out-of-sample modeling errors based on the distribution of test data residuals. We demonstrate the efficacy of this combined approach through a case study of safe optimal fast charging of a high-dimensional lithium-ion battery model at low temperatures.
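The robustification step described in the abstract can be illustrated with a small sketch. This is not the authors' formulation, which the record does not give. It assumes a scalar constraint of the form g <= 0, residuals defined as true minus predicted constraint value, and a 1-Wasserstein ball of hypothetical radius `radius` around the empirical residual distribution. Under those assumptions, the standard Kantorovich-Rubinstein bound says the worst-case mean residual over the ball is the empirical mean plus the radius, and that value is used here as a tightening margin. All function names and numbers are hypothetical.

```python
from statistics import fmean


def worst_case_mean_residual(residuals, radius):
    """Worst-case mean residual over a 1-Wasserstein ball.

    For the identity (1-Lipschitz) loss on the real line, the supremum of
    the mean over all distributions within distance `radius` of the
    empirical distribution equals the empirical mean plus `radius`.
    """
    if radius < 0:
        raise ValueError("radius must be non-negative")
    return fmean(residuals) + radius


def is_robustly_feasible(predicted_constraints, residuals, radius):
    """Check g_hat + margin <= 0 at every step of a predicted trajectory."""
    margin = worst_case_mean_residual(residuals, radius)
    return all(g + margin <= 0.0 for g in predicted_constraints)


# Hypothetical numbers, for illustration only.
test_residuals = [0.02, -0.01, 0.03, 0.00]  # true minus predicted constraint value
predicted = [-0.10, -0.05, -0.02]           # surrogate constraint trajectory (safe if <= 0)

print(is_robustly_feasible(predicted, test_residuals, radius=0.0))
print(is_robustly_feasible(predicted, test_residuals, radius=0.05))
```

A larger radius expresses less trust in the test residuals as a description of out-of-sample errors, so it shrinks the set of control sequences accepted as safe.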
Authors
1. Kandel, Aaron (ORCID: 0000-0001-6973-7308; email: aaronkandel@berkeley.edu), Department of Mechanical Engineering, University of California at Berkeley, Berkeley, CA, USA
2. Park, Saehong (ORCID: 0000-0002-0547-6345; email: sspark@berkeley.edu), Department of Civil and Environmental Engineering, University of California at Berkeley, Berkeley, CA, USA
3. Moura, Scott J. (ORCID: 0000-0002-6393-4375; email: smoura@berkeley.edu), Department of Civil and Environmental Engineering, University of California at Berkeley, Berkeley, CA, USA
CODEN IETTE2
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023
Discipline Engineering
EISSN 1558-0865
Genre orig-research
Funding
National Science Foundation Graduate Research Fellowship (funder ID: 10.13039/100000001)
LG Chem Ltd. (funder ID: 10.13039/501100020430)
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 3
Language English
License https://creativecommons.org/licenses/by/4.0/legalcode
OpenAccessLink https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/document/9940561
PageCount 12
Subject Terms
Adaptation models
Computational modeling
Data models
Errors
High-dimensional control
Lithium-ion batteries
lithium-ion battery
Low temperature
Modelling
nonlinear control
Optimal control
Optimization
Rechargeable batteries
reinforcement learning (RL)
Robust control
robust optimization
Safety
Time series analysis
Uncertainty
URI https://ieeexplore.ieee.org/document/9940561
https://www.proquest.com/docview/2806222584