Adaptive actor-critic control of robots with integral invariant manifold

Bibliographic Details
Published in	2021 IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies (CHILECON), pp. 1-6
Main Authors	Pantoja-Garcia, Luis; Garcia-Rodriguez, Rodolfo; Parra-Vega, Vicente
Format	Conference Proceeding
Language	English
Published	IEEE, 6 December 2021
Subjects	Adaptive-critic scheme; Cost function; Heuristic algorithms; Invariant manifold; Manifolds; Neural Network; Neural networks; Performance evaluator; Reinforcement learning; Robot learning; Robot manipulator; Trajectory
DOI	10.1109/CHILECON54041.2021.9703056

Abstract	The actor-critic scheme is a powerful approach to designing controllers for linear and non-linear systems subject to changing or highly uncertain dynamics. Successful actor-critic schemes are typically based on two neural network stages in a hierarchical architecture: the critic stage approximates the reward cost function, while another neural network in the actor stage estimates the dynamics of the system. This paper proposes adaptive actor-critic robot learning on a lower-dimensional invariant error manifold as part of the Performance Evaluator. Using a modified Lyapunov function, the proposed scheme guarantees an envelope of exponential convergence of the tracking errors through an integral sliding mode enforced for all time, which is also fundamental to driving the learning of the Reward function. Simulations show a non-linear dynamical robot learning to track a time-varying trajectory under this Reinforcement Learning scheme.
Author details	1. Pantoja-Garcia, Luis (luis.pantoja@cinvestav.mx): Research Center for Advanced Studies (CINVESTAV), Robotics and Advanced Manufacturing Department, Saltillo, Coah, Mexico, 25900
	2. Garcia-Rodriguez, Rodolfo (rogarcia@upmh.edu.mx): Universidad Politecnica Metropolitana de Hidalgo, Aeronautical Engineering Program and Postgraduate Program in Aerospacial Engineering, Tolcayuca, Mexico, 43860
	3. Parra-Vega, Vicente (vparra@cinvestav.mx): Research Center for Advanced Studies (CINVESTAV), Robotics and Advanced Manufacturing Department, Saltillo, Coah, Mexico, 25900
DatabaseName	IEEE Electronic Library (IEL) Conference Proceedings; IEEE Xplore POP ALL; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP All) 1998-Present
EISBN	9781665408738 1665408731
EndPage	6
ExternalDocumentID	9703056
Genre	orig-research
IsPeerReviewed	false
IsScholarly	false
PageCount	6
ParticipantIDs	ieee_primary_9703056
PublicationCentury	2000
PublicationDate	2021-Dec.-6
PublicationDateYYYYMMDD	2021-12-06
PublicationDate_xml	– month: 12 year: 2021 text: 2021-Dec.-6 day: 06
PublicationDecade	2020
PublicationTitle	2021 IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies (CHILECON)
PublicationTitleAbbrev	CHILECON
PublicationYear	2021
Publisher	IEEE
Publisher_xml	– name: IEEE
SourceID	ieee
SourceType	Publisher
StartPage	1
Title	Adaptive actor-critic control of robots with integral invariant manifold
URI	https://ieeexplore.ieee.org/document/9703056