Distributionally Robust Surrogate Optimal Control for High-Dimensional Systems
Published in | IEEE Transactions on Control Systems Technology, Vol. 31, no. 3, pp. 1196-1207 |
---|---|
Main Authors | Kandel, Aaron; Park, Saehong; Moura, Scott J. |
Format | Journal Article |
Language | English |
Published | New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.05.2023 |
ISSN | 1063-6536; EISSN 1558-0865 |
DOI | 10.1109/TCST.2022.3216988 |
Abstract | This article presents a novel methodology for tractably solving optimal control and offline reinforcement learning (RL) problems for high-dimensional systems. This work is motivated by the ongoing challenges of safety, computation, and optimality in high-dimensional optimal control. We address these challenges with the following approach. First, we identify a sequence-modeling surrogate methodology that takes as input the initial state and a time series of control inputs and outputs an approximation of the objective function and trajectories of constraint functions. Importantly, this approach entirely absorbs the individual state transition dynamics. The sole dependence on the initial state means we can apply dimensionality reduction to compress the model input while retaining most of its information. Uncertainty in the surrogate objective will affect the resulting optimality. Critically, however, uncertainty in the surrogate constraint functions will lead to infeasibility, i.e., unsafe actions. When considering offline RL, the most significant modeling errors will be encountered on out-of-distribution (OOD) data. Therefore, we apply Wasserstein ambiguity sets to "robustify" our surrogate modeling approach subject to worst-case out-of-sample modeling errors based on the distribution of test data residuals. We demonstrate the efficacy of this combined approach through a case study of safe optimal fast charging of a high-dimensional lithium-ion battery model at low temperatures. |
---|---|
Author | Kandel, Aaron; Park, Saehong; Moura, Scott J. |
Author details | 1. Kandel, Aaron (ORCID 0000-0001-6973-7308; aaronkandel@berkeley.edu), Department of Mechanical Engineering, University of California at Berkeley, Berkeley, CA, USA. 2. Park, Saehong (ORCID 0000-0002-0547-6345; sspark@berkeley.edu), Department of Civil and Environmental Engineering, University of California at Berkeley, Berkeley, CA, USA. 3. Moura, Scott J. (ORCID 0000-0002-6393-4375; smoura@berkeley.edu), Department of Civil and Environmental Engineering, University of California at Berkeley, Berkeley, CA, USA. |
CODEN | IETTE2 |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023 |
Discipline | Engineering |
Genre | orig-research |
Funding | National Science Foundation Graduate Research Fellowship (funder ID 10.13039/100000001); LG Chem Ltd. (funder ID 10.13039/501100020430) |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
License | https://creativecommons.org/licenses/by/4.0/legalcode |
PublicationDateYYYYMMDD | 2023-05-01 |
PublicationTitleAbbrev | TCST |
SubjectTerms | Adaptation models; Computational modeling; Data models; Errors; High-dimensional control; Lithium-ion batteries; Lithium-ion battery; Low temperature; Modelling; Nonlinear control; Optimal control; Optimization; Rechargeable batteries; Reinforcement learning (RL); Robust control; Robust optimization; Safety; Time series analysis; Uncertainty |
URI | https://ieeexplore.ieee.org/document/9940561 https://www.proquest.com/docview/2806222584 |
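
The abstract's idea of tightening surrogate constraints against worst-case errors drawn from test-data residuals can be illustrated with a small sketch. This is not the authors' formulation, which this record does not give. It assumes a scalar constraint of the form g <= limit, residuals defined as (true constraint value minus surrogate prediction) on held-out data, a type-1 Wasserstein ball of radius `epsilon` around the empirical residual distribution, and a CVaR risk measure at tail level `alpha`. Under those assumptions, the worst-case CVaR over the ball is bounded above by the empirical CVaR plus `epsilon / alpha`, because the hinge term in the CVaR formula is 1-Lipschitz. All function and parameter names are hypothetical.

```python
def empirical_cvar(residuals, alpha):
    """Empirical CVaR: roughly the mean of the worst alpha-fraction of residuals."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    xs = sorted(residuals)
    n = len(xs)
    # Rockafellar-Uryasev form: CVaR = min_t t + mean(max(x - t, 0)) / alpha.
    # The objective is piecewise linear and convex in t, so a minimizer lies
    # at one of the sample points; scan them all.
    return min(t + sum(max(x - t, 0.0) for x in xs) / (alpha * n) for t in xs)


def robust_margin(residuals, alpha, epsilon):
    """Upper bound on the worst-case CVaR over a type-1 Wasserstein ball.

    Assumption: residual = true constraint value - surrogate prediction.
    """
    return empirical_cvar(residuals, alpha) + epsilon / alpha


def is_robustly_feasible(predicted, limit, residuals, alpha, epsilon):
    """Accept a control sequence only if prediction plus margin stays within the limit."""
    return predicted + robust_margin(residuals, alpha, epsilon) <= limit
```

In use, a candidate control sequence would be rejected whenever its surrogate-predicted constraint value plus the margin exceeds the limit. A larger `epsilon` or smaller `alpha` gives a more conservative (safer but less optimal) decision.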