Reinforcement Learning for Quantization of Boundary Control Inputs: A Comparison of PPO-based Strategies

Bibliographic Details
Published in Chinese Control Conference, pp. 1093 - 1098
Main Authors Wang, Yibo; Kang, Wen
Format Conference Proceeding
Language English
Published Technical Committee on Control Theory, Chinese Association of Automation, 28.07.2024
Subjects Boundary stabilization; Deep reinforcement learning; Input quantization; Optimization; PD control; Quantization (signal); Simulation; The nonlinear Korteweg-de Vries equation
ISSN 1934-1768
DOI 10.23919/CCC63176.2024.10661946

Abstract This paper investigates the boundary stabilization problem for the Korteweg-de Vries (KdV) system with quantized control inputs via the deep reinforcement learning (DRL) approach. To examine the impact of different placements of the quantizer on stabilization performance, we discuss two scenarios: the quantizer placed in the environment and in the agent. In the case of 'introducing the quantizer into the agent', we further explore two variations: optimizing the parameters of the discretized continuous distribution and directly optimizing the parameters of the discrete distribution. Finally, simulation results demonstrate that the proposed proximal policy optimization (PPO)-based strategies can train DRL controllers that effectively stabilize the target system, with the approach directly learning the parameters of the discrete distribution achieving the highest stabilization efficiency among the quantization-based scenarios.
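The abstract contrasts two PPO policy parameterizations for quantized control inputs: optimizing a continuous distribution whose samples are then discretized, versus directly optimizing a discrete distribution over the quantization levels. The sketch below only illustrates that distinction; it is not the authors' implementation, and the network sizes, quantization grid, and names (QuantizedGaussianPolicy, CategoricalPolicy, LEVELS) are illustrative assumptions.

```python
# Minimal sketch (not the paper's code) of the two quantized-action policy
# parameterizations described in the abstract, using PyTorch distributions.
# Assumptions: a scalar boundary control u(t), a fixed grid of quantization
# levels LEVELS, and small MLP policy heads; all names are illustrative.
import torch
import torch.nn as nn
from torch.distributions import Normal, Categorical

LEVELS = torch.linspace(-1.0, 1.0, steps=9)  # assumed quantization grid for u(t)


class QuantizedGaussianPolicy(nn.Module):
    """Variant 1: optimize a continuous (Gaussian) distribution, then round the
    sampled action to the nearest level ('discretized continuous distribution')."""

    def __init__(self, obs_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, 2))

    def act(self, obs: torch.Tensor):
        mean, log_std = self.net(obs).unbind(-1)
        dist = Normal(mean, log_std.exp())
        raw = dist.sample()                                  # continuous action
        idx = torch.argmin((LEVELS - raw.unsqueeze(-1)).abs(), dim=-1)
        u = LEVELS[idx]                                      # quantized control input
        return u, dist.log_prob(raw)                         # log-prob of the raw sample


class CategoricalPolicy(nn.Module):
    """Variant 2: directly optimize a discrete distribution over the levels."""

    def __init__(self, obs_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(),
                                 nn.Linear(64, LEVELS.numel()))

    def act(self, obs: torch.Tensor):
        dist = Categorical(logits=self.net(obs))
        idx = dist.sample()
        u = LEVELS[idx]                                      # quantized control input
        return u, dist.log_prob(idx)                         # log-prob of the chosen level


if __name__ == "__main__":
    obs = torch.randn(8)  # placeholder observation of the discretized KdV state
    for policy in (QuantizedGaussianPolicy(8), CategoricalPolicy(8)):
        u, logp = policy.act(obs)
        print(type(policy).__name__, float(u), float(logp))
```

In a PPO training loop, either act method would supply the action and its log-probability for the clipped surrogate objective; the categorical variant places probability mass directly on the quantization levels, which is the strategy the abstract reports as achieving the highest stabilization efficiency among the quantization-based scenarios.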
Author Wang, Yibo
Kang, Wen
Author_xml – sequence: 1
  givenname: Yibo
  surname: Wang
  fullname: Wang, Yibo
  email: 3220221688@bit.edu.cn
  organization: Beijing Institute of Technology, School of Mathematics and Statistics, Beijing, P. R. China, 100081
– sequence: 2
  givenname: Wen
  surname: Kang
  fullname: Kang, Wen
  email: kangwen@amss.ac.cn
  organization: Beijing Institute of Technology, School of Mathematics and Statistics, Beijing, P. R. China, 100081
ContentType Conference Proceeding
DOI 10.23919/CCC63176.2024.10661946
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Xplore POP ALL
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
Discipline Engineering
EISBN 9887581585
9789887581581
EISSN 1934-1768
EndPage 1098
ExternalDocumentID 10661946
Genre orig-research
GrantInformation_xml – fundername: National Natural Science Foundation of China
  funderid: 10.13039/501100001809
IsPeerReviewed false
IsScholarly true
Language English
PageCount 6
PublicationCentury 2000
PublicationDate 2024-July-28
PublicationDateYYYYMMDD 2024-07-28
PublicationDate_xml – month: 07
  year: 2024
  text: 2024-July-28
  day: 28
PublicationDecade 2020
PublicationTitle Chinese Control Conference
PublicationTitleAbbrev CCC
PublicationYear 2024
Publisher Technical Committee on Control Theory, Chinese Association of Automation
SourceID ieee
SourceType Publisher
StartPage 1093
SubjectTerms Boundary stabilization
Deep reinforcement learning
Input quantization
Optimization
PD control
Quantization (signal)
Simulation
The nonlinear Korteweg-de Vries equation
Title Reinforcement Learning for Quantization of Boundary Control Inputs: A Comparison of PPO-based Strategies
URI https://ieeexplore.ieee.org/document/10661946