Compositional Learning of Human Activities With a Self-Organizing Neural Architecture

Bibliographic Details
Published in: Frontiers in Robotics and AI, Vol. 6, p. 72
Main Authors: Mici, Luiza; Parisi, German I.; Wermter, Stefan
Format: Journal Article
Language: English
Published: Frontiers Media S.A., Switzerland, 27.08.2019

Summary: An important step for assistive systems and robot companions operating in human environments is to learn the compositionality of human activities, i.e., to recognize both activities and their comprising actions. Most existing approaches address action and activity recognition as separate tasks: actions must be inferred before the activity labels, and such approaches are thus highly sensitive to the correct temporal segmentation of the activity sequences. In this paper, we present a novel learning approach that jointly learns human activities on two levels of semantic and temporal complexity: (1) transitive actions performed on objects, such as a cereal box, and (2) high-level activities composed of such actions. Our model consists of a hierarchy of Growing When Required (GWR) networks which process and learn inherent spatiotemporal dependencies of multiple visual cues extracted from the human body skeletal representation and from the interaction with objects. The neural architecture learns and semantically segments input RGB-D sequences of high-level activities into their composing actions, without supervision. We investigate the performance of our architecture with a set of experiments on a publicly available benchmark dataset. The experimental results show that our approach outperforms the state of the art with respect to the classification of high-level activities. Additionally, we introduce a novel top-down modulation mechanism which uses the action and activity labels as constraints during the learning phase. In our experiments, we show how this mechanism can be used to control the network's neural growth without decreasing the overall performance.
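
The summary builds on the Growing When Required (GWR) network of Marsland et al. (2002) as its basic building block. Below is a minimal sketch of a single GWR network for readers unfamiliar with the model: it grows a new neuron whenever the best-matching neuron both fits the input poorly and is already well trained (habituated). All hyperparameter values, class and method names here are illustrative assumptions, and the sketch deliberately omits the paper's hierarchical stacking, temporal context, and label-driven top-down modulation.

    # Minimal Growing When Required (GWR) sketch; hyperparameters are
    # illustrative assumptions, not the values used by Mici et al.
    import numpy as np

    class GWR:
        def __init__(self, dim, act_thresh=0.85, fire_thresh=0.1,
                     eps_b=0.1, eps_n=0.01, max_age=50, seed=0):
            rng = np.random.default_rng(seed)
            self.w = [rng.random(dim), rng.random(dim)]  # neuron weights
            self.h = [1.0, 1.0]                          # habituation counters
            self.edges = {}                              # (i, j) -> age, i < j
            self.act_thresh = act_thresh                 # insertion threshold
            self.fire_thresh = fire_thresh               # firing threshold
            self.eps_b, self.eps_n = eps_b, eps_n
            self.max_age = max_age

        def _habituate(self, i, tau, kappa=1.05):
            # Habituation decreases toward 1 - 1/kappa as a neuron keeps firing.
            self.h[i] += tau * (kappa * (1.0 - self.h[i]) - 1.0)

        def train_step(self, x):
            x = np.asarray(x, dtype=float)
            dists = [np.linalg.norm(x - w) for w in self.w]
            b, s = np.argsort(dists)[:2]                 # best, second-best
            self.edges[(min(b, s), max(b, s))] = 0       # (re)connect them
            activity = np.exp(-dists[b])
            if activity < self.act_thresh and self.h[b] < self.fire_thresh:
                # Novel input and a well-trained winner: insert a new neuron
                # halfway between the input and the best-matching neuron.
                r = len(self.w)
                self.w.append(0.5 * (self.w[b] + x))
                self.h.append(1.0)
                self.edges.pop((min(b, s), max(b, s)))
                self.edges[(min(b, r), max(b, r))] = 0
                self.edges[(min(s, r), max(s, r))] = 0
            else:
                # Otherwise adapt the winner and its neighbors toward x.
                self.w[b] = self.w[b] + self.eps_b * self.h[b] * (x - self.w[b])
                for (i, j) in list(self.edges):
                    if b in (i, j):
                        n = j if i == b else i
                        self.w[n] = self.w[n] + self.eps_n * self.h[n] * (x - self.w[n])
                        self._habituate(n, tau=0.1)
                        self.edges[(i, j)] += 1          # age edges touching b
            self._habituate(b, tau=0.3)
            # Prune stale edges (isolated-neuron removal omitted for brevity).
            self.edges = {e: a for e, a in self.edges.items()
                          if a <= self.max_age}

    # Example usage: the network grows only where the input needs it.
    net = GWR(dim=3)
    for _ in range(1000):
        net.train_step(np.random.rand(3))
    print(len(net.w), "neurons")

In the paper's architecture, several such networks are stacked so that activation trajectories at one layer become inputs to the next, and the proposed top-down modulation additionally constrains the insertion step using action and activity labels; the sketch above covers only the unsupervised core.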
This article was submitted to Robotic Control Systems, a section of the journal Frontiers in Robotics and AI
Reviewed by: Heng Liu, Huainan Normal University, China; Jose De Jesus Rubio, National Polytechnic Institute, Mexico
Edited by: Yongping Pan, National University of Singapore, Singapore
ISSN: 2296-9144
DOI: 10.3389/frobt.2019.00072