Reinforcement Learning for Quantization of Boundary Control Inputs: A Comparison of PPO-based Strategies
| Published in | Chinese Control Conference, pp. 1093–1098 |
|---|---|
| Main Authors | Wang, Yibo; Kang, Wen (Beijing Institute of Technology, School of Mathematics and Statistics, Beijing, P. R. China, 100081) |
| Format | Conference Proceeding |
| Language | English |
| Published | Technical Committee on Control Theory, Chinese Association of Automation, 28.07.2024 |
| Subjects | Boundary stabilization; Deep reinforcement learning; Input quantization; Optimization; PD control; Quantization (signal); Simulation; The nonlinear Korteweg-de Vries equation |
| Funding | National Natural Science Foundation of China |
| ISSN | 1934-1768 |
| EISBN | 9887581585; 9789887581581 |
| DOI | 10.23919/CCC63176.2024.10661946 |
| Online Access | https://ieeexplore.ieee.org/document/10661946 |
| Abstract | This paper investigates the boundary stabilization problem for the Korteweg-de Vries (KdV) system with quantized control inputs via a deep reinforcement learning (DRL) approach. To examine the impact of different placements of the quantizer on stabilization performance, we discuss two scenarios: the quantizer placed in the environment and the quantizer placed in the agent. In the case of introducing the quantizer into the agent, we further explore two variations: optimizing the parameters of a discretized continuous distribution and directly optimizing the parameters of a discrete distribution. Finally, simulation results demonstrate that the proposed proximal policy optimization (PPO)-based strategies can train DRL controllers that effectively stabilize the target system, with the approach that directly learns the parameters of the discrete distribution achieving the highest stabilization efficiency among the quantization-based scenarios. |
|---|---|
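The two agent-side variants in the abstract differ in where the quantizer meets the policy distribution. Below is a minimal PyTorch sketch, not the authors' implementation: the network sizes, the 9-level quantization grid `LEVELS`, and all class and variable names are illustrative assumptions made only to contrast the two parameterizations.

```python
# Sketch of the two agent-side quantization variants described in the abstract.
# Assumptions (not from the paper): a 1-D control input, a uniform 9-level grid,
# and small MLP policies.
import torch
import torch.nn as nn
from torch.distributions import Normal, Categorical

LEVELS = torch.linspace(-1.0, 1.0, steps=9)  # hypothetical quantizer grid

class GaussianThenQuantize(nn.Module):
    """Variant (a): a continuous Gaussian policy whose sample is snapped to the
    nearest quantization level, i.e. PPO optimizes the parameters of a
    discretized continuous distribution."""
    def __init__(self, obs_dim: int):
        super().__init__()
        self.mu = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, 1))
        self.log_std = nn.Parameter(torch.zeros(1))

    def forward(self, obs: torch.Tensor):
        dist = Normal(self.mu(obs), self.log_std.exp())
        raw = dist.sample()                              # continuous action
        idx = (raw - LEVELS).abs().argmin(dim=-1)        # snap to nearest level
        # log-prob is evaluated at the continuous sample; the environment only
        # ever receives the quantized value LEVELS[idx]
        return LEVELS[idx], dist.log_prob(raw)

class DirectCategorical(nn.Module):
    """Variant (b): a categorical policy whose support is the quantization grid
    itself, so PPO optimizes the discrete probabilities directly."""
    def __init__(self, obs_dim: int):
        super().__init__()
        self.logits = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(),
                                    nn.Linear(64, LEVELS.numel()))

    def forward(self, obs: torch.Tensor):
        dist = Categorical(logits=self.logits(obs))
        idx = dist.sample()                              # index into the grid
        return LEVELS[idx], dist.log_prob(idx)
```

The structural difference this sketch exposes: in variant (a) the PPO likelihood ratio is computed on the continuous sample while the plant sees only the snapped value, whereas in variant (b) the grid is the action space itself, so the policy gradient acts directly on the discrete probabilities. The paper reports that the latter achieves the highest stabilization efficiency among the quantization-based scenarios.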