Reinforcement Learning of Manipulation and Grasping Using Dynamical Movement Primitives for a Humanoidlike Mobile Manipulator

Bibliographic Details
Published in IEEE/ASME Transactions on Mechatronics Vol. 23; no. 1; pp. 121-131
Main Authors Li, Zhijun; Zhao, Ting; Chen, Fei; Hu, Yingbai; Su, Chun-Yi; Fukuda, Toshio
Format Journal Article
Language English
Published IEEE 01.02.2018
ISSN 1083-4435
EISSN 1941-014X
DOI 10.1109/TMECH.2017.2717461

Abstract It is important for humanoid-like mobile robots to learn complex motion sequences in human-robot environments so that they can adapt those motions. This paper describes a reinforcement learning (RL) strategy for manipulation and grasping with a mobile manipulator, which reduces the complexity of the visual feedback and handles varying manipulation dynamics and uncertain external perturbations. The proposed strategy comprises two planning hierarchies: 1) high-level online redundancy resolution based on a neural-dynamic optimization algorithm in operational space; and 2) low-level RL in joint space. At the low level, dynamic movement primitives are used to model and learn the joint trajectories, and RL is then employed to learn the trajectories under uncertainties. Experimental results on the developed humanoid-like mobile robot demonstrate that the presented approach can suppress uncertain external perturbations.
Author_xml – sequence: 1
  givenname: Zhijun
  orcidid: 0000-0002-3909-488X
  surname: Li
  fullname: Li, Zhijun
  email: zjli@ieee.org
  organization: Guangzhou, China
– sequence: 2
  givenname: Ting
  surname: Zhao
  fullname: Zhao, Ting
  email: zt20102011@163.com
  organization: Guangzhou, China
– sequence: 3
  givenname: Fei
  surname: Chen
  fullname: Chen, Fei
  email: fei.chen@iit.it
  organization: Genova, Italy
– sequence: 4
  givenname: Yingbai
  surname: Hu
  fullname: Hu, Yingbai
  email: 13249144573@163.com
  organization: Guangzhou, China
– sequence: 5
  givenname: Chun-Yi
  surname: Su
  fullname: Su, Chun-Yi
  email: chunyi.su@gmail.com
  organization: Montreal, QC, Canada
– sequence: 6
  givenname: Toshio
  surname: Fukuda
  fullname: Fukuda, Toshio
  email: tofukuda@nifty.com
  organization: Beijing, China
CODEN IATEFW
Discipline Engineering
Genre orig-research
IsPeerReviewed true
IsScholarly true
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
SubjectTerms Dynamic movement primitive (DMP)
Learning (artificial intelligence)
Manipulator dynamics
Mobile communication
mobile manipulation
redundancy resolution
reinforcement learning (RL)
Robot sensing systems
Trajectory
URI https://ieeexplore.ieee.org/document/7953692