Adaptive actor-critic control of robots with integral invariant manifold

Bibliographic Details
Published in 2021 IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies (CHILECON), pp. 1-6
Main Authors Pantoja-Garcia, Luis; Garcia-Rodriguez, Rodolfo; Parra-Vega, Vicente
Format Conference Proceeding
Language English
Published IEEE 06.12.2021
DOI 10.1109/CHILECON54041.2021.9703056

Abstract The actor-critic scheme is a powerful approach to designing controllers for linear and nonlinear systems subject to changing or highly uncertain dynamics. Successful actor-critic schemes are typically based on two neural network stages in a hierarchical architecture: the critic stage approximates the reward (cost) function, while a second neural network in the actor stage estimates the dynamics of the system. This paper proposes adaptive actor-critic robot learning on a lower-dimensional invariant error manifold that forms part of the Performance Evaluator. Using a modified Lyapunov function, the proposed scheme guarantees an envelope of exponential convergence of the tracking errors through an integral sliding mode enforced for all time, which is also fundamental to driving the learning of the reward function. Simulations show a nonlinear robot learning to track a time-varying trajectory under this reinforcement learning scheme.
Author_xml – sequence: 1
  givenname: Luis
  surname: Pantoja-Garcia
  fullname: Pantoja-Garcia, Luis
  email: luis.pantoja@cinvestav.mx
  organization: Research Center for Advanced Studies (CINVESTAV), Robotics and Advanced Manufacturing Department, Saltillo, Coah, Mexico, 25900
– sequence: 2
  givenname: Rodolfo
  surname: Garcia-Rodriguez
  fullname: Garcia-Rodriguez, Rodolfo
  email: rogarcia@upmh.edu.mx
  organization: Universidad Politecnica Metropolitana de Hidalgo, Aeronautical Engineering Program and Postgraduate Program in Aerospacial Engineering, Tolcayuca, Mexico, 43860
– sequence: 3
  givenname: Vicente
  surname: Parra-Vega
  fullname: Parra-Vega, Vicente
  email: vparra@cinvestav.mx
  organization: Research Center for Advanced Studies (CINVESTAV), Robotics and Advanced Manufacturing Department, Saltillo, Coah, Mexico, 25900
ContentType Conference Proceeding
EISBN 9781665408738
1665408731
EndPage 6
ExternalDocumentID 9703056
Genre orig-research
IsPeerReviewed false
IsScholarly false
Language English
PageCount 6
PublicationCentury 2000
PublicationDate 2021-Dec.-6
PublicationDateYYYYMMDD 2021-12-06
PublicationDate_xml – month: 12
  year: 2021
  text: 2021-Dec.-6
  day: 06
PublicationDecade 2020
PublicationTitle 2021 IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies (CHILECON)
PublicationTitleAbbrev CHILECON
PublicationYear 2021
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 1
SubjectTerms Adaptive-critic scheme
Cost function
Heuristic algorithms
Invariant manifold
Manifolds
Neural Network
Neural networks
Performance evaluator
Reinforcement learning
Robot learning
Robot manipulator
Trajectory
Title Adaptive actor-critic control of robots with integral invariant manifold
URI https://ieeexplore.ieee.org/document/9703056