APEX: Unsupervised, Object-Centric Scene Segmentation and Tracking for Robot Manipulation
Published in | 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) pp. 3375 - 3382 |
---|---|
Main Authors | Wu, Yizhe; Jones, Oiwi Parker; Engelcke, Martin; Posner, Ingmar |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 27.09.2021 |
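Subjects | Manipulators; Object detection; Object segmentation; Probabilistic logic; Shape; Task analysis; Three-dimensional displays |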
Abstract | Recent advances in unsupervised learning for object detection, segmentation, and tracking hold significant promise for applications in robotics. A common approach is to frame these tasks as inference in probabilistic latent-variable models. In this paper, however, we show that the current state-of-the-art struggles with visually complex scenes such as typically encountered in robot manipulation tasks. We propose APEX, a new latent-variable model which is able to segment and track objects in more realistic scenes featuring objects that vary widely in size and texture, including the robot arm itself. This is achieved by a principled mask normalisation algorithm and a high-resolution scene encoder. To evaluate our approach, we present results on the real-world Sketchy dataset. This dataset, however, does not contain ground truth masks and object IDs for a quantitative evaluation. We thus introduce the Panda Pushing Dataset (P2D) which shows a Panda arm interacting with objects on a table in simulation and which includes ground-truth segmentation masks and object IDs for tracking. In both cases, APEX comprehensively outperforms the current state-of-the-art in unsupervised object segmentation and tracking. We demonstrate the efficacy of our segmentations for robot skill execution on an object arrangement task, where we also achieve the best or comparable performance among all the baselines. |
---|---|
Author | Engelcke, Martin; Posner, Ingmar; Jones, Oiwi Parker; Wu, Yizhe |
Author_xml | 1. Wu, Yizhe (ywu@robots.ox.ac.uk), University of Oxford, Applied AI Lab, Oxford Robotics Institute; 2. Jones, Oiwi Parker, University of Oxford, Applied AI Lab, Oxford Robotics Institute; 3. Engelcke, Martin, University of Oxford, Applied AI Lab, Oxford Robotics Institute; 4. Posner, Ingmar, University of Oxford, Applied AI Lab, Oxford Robotics Institute |
DOI | 10.1109/IROS51168.2021.9636711 |
Discipline | Engineering |
EISBN | 1665417145 9781665417143 |
EISSN | 2153-0866 |
EndPage | 3382 |
ExternalDocumentID | 9636711 |
Genre | orig-research |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | true |
Language | English |
OpenAccessLink | https://arxiv.org/pdf/2105.14895 |
PageCount | 8 |
PublicationDate | 2021-09-27 |
PublicationTitle | 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
PublicationTitleAbbrev | IROS |
PublicationYear | 2021 |
Publisher | IEEE |
StartPage | 3375 |
SubjectTerms | Manipulators; Object detection; Object segmentation; Probabilistic logic; Shape; Task analysis; Three-dimensional displays |
Title | APEX: Unsupervised, Object-Centric Scene Segmentation and Tracking for Robot Manipulation |
URI | https://ieeexplore.ieee.org/document/9636711 |