Relative view based holistic-separate representations for two-person interaction recognition using multiple graph convolutional networks



Bibliographic Details
Published in: Journal of Visual Communication and Image Representation, Vol. 70, p. 102833
Main Authors: Liu, Xing; Li, Yanshan; Guo, Tianyu; Xia, Rongjie
Format: Journal Article
Language: English
Published: Elsevier Inc, 01.07.2020
ISSN: 1047-3203
eISSN: 1095-9076
DOI: 10.1016/j.jvcir.2020.102833

Summary:
•Our method, with an efficient view transformation scheme, achieves superior performance.
•The proposed representations consider both the whole interaction and the motions of each person.
•Results on the two largest skeletal interaction datasets demonstrate its superiority.

In this paper, we focus on recognizing person-person interactions using skeletal data captured by depth sensors. First, we propose a novel and efficient view transformation scheme: the skeletal interaction sequence is re-observed under a new coordinate system that is invariant to the various setups and capturing views of depth cameras, as well as to the exchange of position or facing orientation between the two persons. Second, we propose concise and discriminative interaction representations composed simply of the joint locations of the two persons. The proposed representations efficiently describe both the holistic interactive scene and the individual poses performed by each subject separately. Third, we introduce graph convolutional networks (GCN) to learn the proposed skeletal interaction representations directly, and we design a multiple-GCN-based model to produce the final class score. Extensive experimental results on three skeletal action datasets, NTU RGB+D 60, NTU RGB+D 120 and SBU, consistently demonstrate the superiority of our interaction recognition method.
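The abstract does not specify the exact coordinate system used, but a view transformation of the kind it describes can be sketched as follows: place the origin at the midpoint between the two persons' body centers and align one axis with the line connecting them, so the re-observed joints become invariant to camera translation and to rotation about the vertical axis. All names below (the `view_transform` helper, the fixed up-vector convention) are illustrative assumptions, not the paper's method.

```python
import numpy as np

def view_transform(p1, p2):
    """Re-observe a two-person skeleton frame in a camera-invariant frame.

    p1, p2: (J, 3) arrays of 3D joint positions for each person.
    Sketch only: origin = midpoint of the two body centers, x-axis = line
    connecting the centers, y-axis chosen closest to the world up direction.
    """
    c1, c2 = p1.mean(axis=0), p2.mean(axis=0)
    origin = (c1 + c2) / 2.0

    # New x-axis: direction from person 1's center to person 2's center.
    x = c2 - c1
    x /= np.linalg.norm(x) + 1e-8

    # Build an orthonormal basis using the world up vector (assumed +y).
    up = np.array([0.0, 1.0, 0.0])
    z = np.cross(x, up)
    z /= np.linalg.norm(z) + 1e-8
    y = np.cross(z, x)

    R = np.stack([x, y, z])  # rows are the new basis vectors
    # Express every joint in the new coordinate system.
    return (p1 - origin) @ R.T, (p2 - origin) @ R.T
```

By construction, translating the whole scene (e.g. moving the depth camera) leaves the transformed joints unchanged, which is the invariance property the abstract highlights.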