Adaptive actor-critic control of robots with integral invariant manifold
Published in | 2021 IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies (CHILECON) pp. 1 - 6 |
---|---|
Main Authors | Pantoja-Garcia, Luis; Garcia-Rodriguez, Rodolfo; Parra-Vega, Vicente |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 06.12.2021 |
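Subjects | Adaptive-critic scheme; Cost function; Heuristic algorithms; Invariant manifold; Manifolds; Neural networks; Performance evaluator; Reinforcement learning; Robot learning; Robot manipulator; Trajectory |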
DOI | 10.1109/CHILECON54041.2021.9703056 |
Abstract | The actor-critic scheme is a powerful approach to designing controllers for linear and non-linear systems subject to changing or highly uncertain dynamics. Successful actor-critic schemes are typically built from two neural network stages in a hierarchical architecture: the critic stage approximates the reward (cost) function, while a second neural network in the actor stage estimates the dynamics of the system. This paper proposes adaptive actor-critic robot learning on a lower-dimensional invariant error manifold that forms part of the Performance Evaluator. Using a modified Lyapunov function, the proposed scheme guarantees an envelope of exponential convergence of the tracking errors through an integral sliding mode enforced for all time, which is also fundamental to driving the learning of the reward function. Simulations show a non-linear dynamical robot learning to track a time-varying trajectory under this reinforcement learning scheme. |
---|---|
Author | Parra-Vega, Vicente; Pantoja-Garcia, Luis; Garcia-Rodriguez, Rodolfo |
Author_xml | 1. Pantoja-Garcia, Luis (luis.pantoja@cinvestav.mx), Research Center for Advanced Studies (CINVESTAV), Robotics and Advanced Manufacturing Department, Saltillo, Coah, Mexico, 25900; 2. Garcia-Rodriguez, Rodolfo (rogarcia@upmh.edu.mx), Universidad Politecnica Metropolitana de Hidalgo, Aeronautical Engineering Program and Postgraduate Program in Aerospacial Engineering, Tolcayuca, Mexico, 43860; 3. Parra-Vega, Vicente (vparra@cinvestav.mx), Research Center for Advanced Studies (CINVESTAV), Robotics and Advanced Manufacturing Department, Saltillo, Coah, Mexico, 25900 |
ContentType | Conference Proceeding |
EISBN | 9781665408738 1665408731 |
EndPage | 6 |
ExternalDocumentID | 9703056 |
Genre | orig-research |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
PageCount | 6 |
PublicationCentury | 2000 |
PublicationDate | 2021-Dec.-6 |
PublicationDateYYYYMMDD | 2021-12-06 |
PublicationDate_xml | – month: 12 year: 2021 text: 2021-Dec.-6 day: 06 |
PublicationDecade | 2020 |
PublicationTitle | 2021 IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies (CHILECON) |
PublicationTitleAbbrev | CHILECON |
PublicationYear | 2021 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
StartPage | 1 |
SubjectTerms | Adaptive-critic scheme; Cost function; Heuristic algorithms; Invariant manifold; Manifolds; Neural networks; Performance evaluator; Reinforcement learning; Robot learning; Robot manipulator; Trajectory |
Title | Adaptive actor-critic control of robots with integral invariant manifold |
URI | https://ieeexplore.ieee.org/document/9703056 |