Distributionally Robust Surrogate Optimal Control for High-Dimensional Systems

Bibliographic Details
Published in	IEEE Transactions on Control Systems Technology, Vol. 31, no. 3, pp. 1196-1207
Main Authors	Kandel, Aaron; Park, Saehong; Moura, Scott J.
Format	Journal Article
Language	English
Published	New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.05.2023
Subjects	Adaptation models; Computational modeling; Data models; Errors; High-dimensional control; Lithium-ion batteries; lithium-ion battery; Low temperature; Modelling; nonlinear control; Optimal control; Optimization; Rechargeable batteries; reinforcement learning (RL); Robust control; robust optimization; Safety; Time series analysis; Uncertainty
ISSN	1063-6536 1558-0865
DOI	10.1109/TCST.2022.3216988

Abstract	This article presents a novel methodology for tractably solving optimal control and offline reinforcement learning (RL) problems for high-dimensional systems. This work is motivated by the ongoing challenges of safety, computation, and optimality in high-dimensional optimal control. We address these challenges with the following approach. First, we identify a sequence-modeling surrogate methodology that takes as input the initial state and a time series of control inputs, and outputs an approximation of the objective function and the trajectories of the constraint functions. Importantly, this approach entirely absorbs the individual state transition dynamics. Because the model depends on the state only through the initial state, we can apply dimensionality reduction to compress the model input while retaining most of its information. Uncertainty in the surrogate objective will affect the resulting optimality. Critically, however, uncertainty in the surrogate constraint functions will lead to infeasibility, i.e., unsafe actions. When considering offline RL, the most significant modeling errors will be encountered on out-of-distribution (OOD) data. Therefore, we apply Wasserstein ambiguity sets to "robustify" our surrogate modeling approach against worst-case out-of-sample modeling errors, based on the distribution of test data residuals. We demonstrate the efficacy of this combined approach through a case study of safe optimal fast charging of a high-dimensional lithium-ion battery model at low temperatures.
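As a rough illustration of the robustification idea in the abstract (not the authors' formulation), the Python sketch below tightens a surrogate constraint by a margin derived from held-out residuals. It assumes a scalar constraint of the form g(u) <= 0, signed residuals defined as true minus predicted constraint values, and a 1-Wasserstein ball of hypothetical radius `radius` around the empirical residual distribution. For a 1-Lipschitz function such as the residual itself, the worst-case mean over that ball is at most the empirical mean plus the radius. The names `wasserstein_margin` and `is_robustly_feasible` are illustrative only, and the article may well use a different reformulation, for example one based on chance constraints.

```python
from statistics import fmean


def wasserstein_margin(residuals, radius):
    """Bound the worst-case mean residual over a 1-Wasserstein ball.

    Assumption: residuals are signed errors (true minus predicted
    constraint value) measured on held-out test data. For a 1-Lipschitz
    function, the expectation under any distribution within distance
    `radius` of the empirical one exceeds the empirical mean by at most
    `radius`.
    """
    if radius < 0:
        raise ValueError("radius must be non-negative")
    return fmean(residuals) + radius


def is_robustly_feasible(g_pred, margin):
    """Check the tightened constraint g_pred[t] + margin <= 0 for all t."""
    return all(g + margin <= 0.0 for g in g_pred)


# Hypothetical held-out residuals and predicted constraint trajectories.
test_residuals = [0.1, -0.2, 0.3, 0.0]
margin = wasserstein_margin(test_residuals, radius=0.05)

safe_trajectory = [-0.5, -0.2, -0.15]
risky_trajectory = [-0.5, -0.05]

print(is_robustly_feasible(safe_trajectory, margin))
print(is_robustly_feasible(risky_trajectory, margin))
```

In this toy setting the second trajectory satisfies the nominal surrogate constraint but fails the tightened one, which is the kind of conservatism the ambiguity set is meant to introduce. A larger radius trades optimality for a wider safety margin.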
Author_xml	1. Kandel, Aaron; ORCID: 0000-0001-6973-7308; email: aaronkandel@berkeley.edu; Department of Mechanical Engineering, University of California at Berkeley, Berkeley, CA, USA
	2. Park, Saehong; ORCID: 0000-0002-0547-6345; email: sspark@berkeley.edu; Department of Civil and Environmental Engineering, University of California at Berkeley, Berkeley, CA, USA
	3. Moura, Scott J.; ORCID: 0000-0002-6393-4375; email: smoura@berkeley.edu; Department of Civil and Environmental Engineering, University of California at Berkeley, Berkeley, CA, USA
CODEN	IETTE2
CitedBy_id	crossref_primary_10_1109_TCST_2023_3324869 crossref_primary_10_1109_TSG_2024_3371221
Copyright	Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023
Discipline	Engineering
Genre	orig-research
GrantInformation_xml	National Science Foundation Graduate Research Fellowship (funder ID: 10.13039/100000001); LG Chem Ltd. (funder ID: 10.13039/501100020430)
IsDoiOpenAccess	true
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
License	https://creativecommons.org/licenses/by/4.0/legalcode
OpenAccessLink	https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/document/9940561
PageCount	12
PublicationTitleAbbrev	TCST
URI	https://ieeexplore.ieee.org/document/9940561 https://www.proquest.com/docview/2806222584