Reinforcement Learning of Manipulation and Grasping Using Dynamical Movement Primitives for a Humanoidlike Mobile Manipulator

Bibliographic Details
Published in	IEEE/ASME Transactions on Mechatronics, Vol. 23, no. 1, pp. 121-131
Main Authors	Li, Zhijun; Zhao, Ting; Chen, Fei; Hu, Yingbai; Su, Chun-Yi; Fukuda, Toshio
Format	Journal Article
Language	English
Published	IEEE 01.02.2018
Subjects	Dynamic movement primitive (DMP); Learning (artificial intelligence); Manipulator dynamics; Mobile communication; mobile manipulation; redundancy resolution; reinforcement learning (RL); Robot sensing systems; Trajectory
Online Access	Get full text
ISSN	1083-4435 1941-014X
DOI	10.1109/TMECH.2017.2717461


Abstract	It is important for humanoid-like mobile robots to learn complex motion sequences in human-robot environments so that they can adapt those motions. This paper describes a reinforcement learning (RL) strategy for manipulation and grasping with a mobile manipulator, which reduces the complexity of the visual feedback and handles varying manipulation dynamics and uncertain external perturbations. The proposed strategy uses two levels of hierarchical planning: 1) high-level online redundancy resolution based on a neural-dynamic optimization algorithm in operational space; and 2) low-level RL in joint space. At the low level, dynamic movement primitives are used to model and learn the joint trajectories, and RL is then employed to learn the trajectories under uncertainties. Experimental results on the developed humanoid-like mobile robot demonstrate that the presented approach can suppress uncertain external perturbations.
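The low-level idea in the abstract, modeling a joint trajectory with a dynamic movement primitive (DMP), can be illustrated with a minimal one-joint sketch. This is not the authors' implementation: the function name `dmp_rollout`, the gain values, the basis-function placement, and the Euler integration are all assumptions chosen for illustration. It uses the standard discrete DMP form, a critically damped spring toward the goal plus a phase-gated forcing term whose weights are the quantities an RL method would typically adjust.

```python
import math

def dmp_rollout(y0, goal, weights, tau=1.0, dt=0.001,
                alpha=25.0, beta=6.25, alpha_x=3.0):
    """Integrate a one-dimensional discrete DMP with explicit Euler steps.

    All parameter names and default values are illustrative assumptions.
    """
    n = len(weights)
    # Basis-function centers spaced along the decaying phase variable x.
    centers = [math.exp(-alpha_x * i / max(n - 1, 1)) for i in range(n)]
    # Heuristic widths: narrower basis functions where centers are small.
    widths = [n ** 1.5 / c for c in centers]
    y, z, x = y0, 0.0, 1.0  # position, scaled velocity, phase
    trajectory = [y]
    for _ in range(int(round(tau / dt))):
        psi = [math.exp(-h * (x - c) ** 2) for h, c in zip(widths, centers)]
        total = sum(psi)
        forcing = 0.0
        if total > 1e-12:
            # Normalized weighted sum, gated by the phase and scaled by (goal - y0).
            forcing = (sum(p * w for p, w in zip(psi, weights)) / total
                       * x * (goal - y0))
        dz = (alpha * (beta * (goal - y) - z) + forcing) / tau
        dy = z / tau
        dx = -alpha_x * x / tau
        z += dz * dt
        y += dy * dt
        x += dx * dt
        trajectory.append(y)
    return trajectory

# With zero weights the DMP reduces to a critically damped reach to the goal.
plain = dmp_rollout(0.0, 1.0, [0.0] * 10)
# Nonzero weights reshape the path; an RL method would search over these.
shaped = dmp_rollout(0.0, 1.0, [50.0] * 10)
```

In a scheme like the one the abstract describes, the weights would first be fitted to a demonstrated joint trajectory and then perturbed and updated by the RL step according to a task cost; that update rule is not specified in this record and is left out here.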
Author_xml	1. Li, Zhijun (ORCID 0000-0002-3909-488X), zjli@ieee.org, Guangzhou, China; 2. Zhao, Ting, zt20102011@163.com, Guangzhou, China; 3. Chen, Fei, fei.chen@iit.it, Genova, Italy; 4. Hu, Yingbai, 13249144573@163.com, Guangzhou, China; 5. Su, Chun-Yi, chunyi.su@gmail.com, Montreal, QC, Canada; 6. Fukuda, Toshio, tofukuda@nifty.com, Beijing, China
CODEN	IATEFW
ContentType	Journal Article
DBID	97E RIA RIE AAYXX CITATION
DatabaseName	IEEE All-Society Periodicals Package (ASPP) 2005-present IEEE All-Society Periodicals Package (ASPP) 1998-Present IEEE Electronic Library (IEL) CrossRef
DatabaseTitle	CrossRef
Discipline	Engineering
EISSN	1941-014X
EndPage	131
ExternalDocumentID	10_1109_TMECH_2017_2717461 7953692
Genre	orig-research
ISSN	1083-4435
IngestDate	Tue Jul 01 04:23:14 EDT 2025 Thu Apr 24 23:00:07 EDT 2025 Wed Aug 27 02:52:34 EDT 2025
IsPeerReviewed	true
IsScholarly	true
Issue	1
Language	English
License	https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
ORCID	0000-0002-3909-488X
PageCount	11
PublicationCentury	2000
PublicationDate	February 2018
PublicationDateYYYYMMDD	2018-02-01
PublicationDecade	2010
PublicationTitle	IEEE/ASME transactions on mechatronics
PublicationTitleAbbrev	TMECH
PublicationYear	2018
Publisher	IEEE
StartPage	121
URI	https://ieeexplore.ieee.org/document/7953692