Reinforcement Learning for Quantization of Boundary Control Inputs: A Comparison of PPO-based Strategies

Bibliographic Details
Published in Chinese Control Conference, pp. 1093 - 1098
Main Authors Wang, Yibo; Kang, Wen
Format Conference Proceeding
Language English
Published Technical Committee on Control Theory, Chinese Association of Automation, 28.07.2024
Subjects Boundary stabilization; Deep reinforcement learning; Input quantization; Optimization; PD control; Quantization (signal); Simulation; The nonlinear Korteweg-de Vries equation
ISSN 1934-1768
DOI 10.23919/CCC63176.2024.10661946

Abstract This paper investigates the boundary stabilization problem for the Korteweg-de Vries (KdV) system with quantized control inputs via the deep reinforcement learning (DRL) approach. To examine the impact of different placements of the quantizer on stabilization performance, we discuss two scenarios: the quantizer placed in the environment and in the agent. In the case of 'introducing the quantizer into the agent', we further explore two variations: optimizing the parameters of the discretized continuous distribution and directly optimizing the parameters of the discrete distribution. Finally, simulation results demonstrate that the proposed proximal policy optimization (PPO)-based strategies can train DRL controllers that effectively stabilize the target system, with the approach directly learning the parameters of the discrete distribution achieving the highest stabilization efficiency among the quantization-based scenarios.
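The abstract contrasts two PPO policy parameterizations for quantized control inputs: optimizing a continuous distribution whose samples are then discretized, versus directly optimizing a discrete distribution over the quantization levels. The sketch below only illustrates that distinction; it is not the authors' implementation, and the network sizes, quantization grid, and names (QuantizedGaussianPolicy, CategoricalPolicy, LEVELS) are illustrative assumptions.

```python
# Minimal sketch (not the paper's code) of the two quantized-action policy
# parameterizations described in the abstract, using PyTorch distributions.
# Assumptions: a scalar boundary control u(t), a fixed grid of quantization
# levels LEVELS, and small MLP policy heads; all names are illustrative.
import torch
import torch.nn as nn
from torch.distributions import Normal, Categorical

LEVELS = torch.linspace(-1.0, 1.0, steps=9)  # assumed quantization grid for u(t)


class QuantizedGaussianPolicy(nn.Module):
    """Variant 1: optimize a continuous (Gaussian) distribution, then round the
    sampled action to the nearest level ('discretized continuous distribution')."""

    def __init__(self, obs_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, 2))

    def act(self, obs: torch.Tensor):
        mean, log_std = self.net(obs).unbind(-1)
        dist = Normal(mean, log_std.exp())
        raw = dist.sample()                                  # continuous action
        idx = torch.argmin((LEVELS - raw.unsqueeze(-1)).abs(), dim=-1)
        u = LEVELS[idx]                                      # quantized control input
        return u, dist.log_prob(raw)                         # log-prob of the raw sample


class CategoricalPolicy(nn.Module):
    """Variant 2: directly optimize a discrete distribution over the levels."""

    def __init__(self, obs_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(),
                                 nn.Linear(64, LEVELS.numel()))

    def act(self, obs: torch.Tensor):
        dist = Categorical(logits=self.net(obs))
        idx = dist.sample()
        u = LEVELS[idx]                                      # quantized control input
        return u, dist.log_prob(idx)                         # log-prob of the chosen level


if __name__ == "__main__":
    obs = torch.randn(8)  # placeholder observation of the discretized KdV state
    for policy in (QuantizedGaussianPolicy(8), CategoricalPolicy(8)):
        u, logp = policy.act(obs)
        print(type(policy).__name__, float(u), float(logp))
```

In a PPO training loop, either act method would supply the action and its log-probability for the clipped surrogate objective; the categorical variant places probability mass directly on the quantization levels, which is the strategy the abstract reports as achieving the highest stabilization efficiency among the quantization-based scenarios.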
Author Wang, Yibo
Kang, Wen
Author_xml – sequence: 1
  givenname: Yibo
  surname: Wang
  fullname: Wang, Yibo
  email: 3220221688@bit.edu.cn
  organization: Beijing Institute of Technology, School of Mathematics and Statistics, Beijing, P. R. China, 100081
– sequence: 2
  givenname: Wen
  surname: Kang
  fullname: Kang, Wen
  email: kangwen@amss.ac.cn
  organization: Beijing Institute of Technology, School of Mathematics and Statistics, Beijing, P. R. China, 100081
ContentType Conference Proceeding
DOI 10.23919/CCC63176.2024.10661946
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Xplore POP ALL
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
Discipline Engineering
EISBN 9887581585
9789887581581
EISSN 1934-1768
EndPage 1098
ExternalDocumentID 10661946
Genre orig-research
GrantInformation_xml – fundername: National Natural Science Foundation of China
  funderid: 10.13039/501100001809
IsPeerReviewed false
IsScholarly true
Language English
PageCount 6
PublicationCentury 2000
PublicationDate 2024-July-28
PublicationDateYYYYMMDD 2024-07-28
PublicationDate_xml – month: 07
  year: 2024
  text: 2024-July-28
  day: 28
PublicationDecade 2020
PublicationTitle Chinese Control Conference
PublicationTitleAbbrev CCC
PublicationYear 2024
Publisher Technical Committee on Control Theory, Chinese Association of Automation
SourceID ieee
SourceType Publisher
StartPage 1093
SubjectTerms Boundary stabilization
Deep reinforcement learning
Input quantization
Optimization
PD control
Quantization (signal)
Simulation
The nonlinear Korteweg-de Vries equation
Title Reinforcement Learning for Quantization of Boundary Control Inputs: A Comparison of PPO-based Strategies
URI https://ieeexplore.ieee.org/document/10661946