IPGN: Interactiveness Proposal Graph Network for Human-Object Interaction Detection


Bibliographic Details
Published in	IEEE Transactions on Image Processing, Vol. 30, pp. 6583-6593
Main Authors	Wang, Haoran; Jiao, Licheng; Liu, Fang; Li, Lingling; Liu, Xu; Ji, Deyi; Gan, Weihao
Format	Journal Article
Language	English
Published	New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 2021
Subjects	Human-object interaction detection; interaction learning; interactiveness proposal; Knowledge engineering; Learning; Message passing; Pose estimation; Proposals; Semantics; Task analysis; two-stage graph model; Visualization

Summary:	Human-Object Interaction (HOI) detection is an important task for understanding how humans interact with objects. Most existing works treat this task as an exhaustive triplet $\langle \text{human}, \text{verb}, \text{object} \rangle$ classification problem. In this paper, we decompose it and propose a novel two-stage graph model, the Interactiveness Proposal Graph Network (IPGN), which learns interactiveness and interaction knowledge in one network. In the first stage, we design a fully connected graph for learning interactiveness, which distinguishes whether a human-object pair is interactive or not. Concretely, it generates interactiveness features that encode high-level semantic interactiveness knowledge for each pair. Class-agnostic interactiveness is a more general and simpler objective, which can be used to provide reasonable proposals for graph construction in the second stage. In the second stage, a sparsely connected graph is constructed from all interactive pairs selected by the first stage. Specifically, we use the interactiveness knowledge to guide the message passing. In contrast to feature similarity, it explicitly represents the connections between the nodes. Benefiting from this valid graph reasoning, the node features are well encoded for interaction learning. Experiments show that the proposed method achieves state-of-the-art performance on both the V-COCO and HICO-DET datasets.
ISSN:	1057-7149, 1941-0042
DOI:	10.1109/TIP.2021.3096333
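
The two-stage idea in the summary can be illustrated with a toy sketch. This is not the authors' implementation: the record gives no architectural details, so the logistic pair scorer, the weight vector `w`, the 0.5 threshold, the additive message update, and every function name below are assumptions made for illustration only. Stage one scores every human-object pair (a fully connected graph) for class-agnostic interactiveness; stage two keeps only the pairs that pass the threshold (a sparse graph) and passes messages along those edges, weighting each message by the interactiveness score rather than by feature similarity.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def stage_one_interactiveness(human_feats, object_feats, w, bias=0.0):
    """Score every human-object pair (fully connected graph).

    Assumption: a pair is described by concatenating its two feature
    vectors, and scored with a logistic unit using hypothetical weights w.
    """
    scores = {}
    for h, human_f in enumerate(human_feats):
        for o, object_f in enumerate(object_feats):
            pair_feature = human_f + object_f  # list concatenation
            logit = sum(wi * xi for wi, xi in zip(w, pair_feature)) + bias
            scores[(h, o)] = sigmoid(logit)
    return scores


def select_proposals(scores, threshold=0.5):
    """Keep only pairs judged interactive; these become the sparse edges."""
    return {pair: s for pair, s in scores.items() if s >= threshold}


def stage_two_message_passing(human_feats, object_feats, proposals):
    """One round of message passing over the sparse graph.

    Assumption: each node adds its neighbours' features, weighted by the
    interactiveness score of the connecting edge. Human and object
    features must have the same dimension for this simple update.
    """
    new_h = [list(f) for f in human_feats]
    new_o = [list(f) for f in object_feats]
    for (h, o), score in proposals.items():
        for d in range(len(human_feats[h])):
            new_h[h][d] += score * object_feats[o][d]
            new_o[o][d] += score * human_feats[h][d]
    return new_h, new_o


# Toy inputs: two humans, two objects, 2-dimensional features.
human_feats = [[1.0, 0.0], [0.0, 1.0]]
object_feats = [[1.0, 1.0], [-2.0, -2.0]]
w = [1.0, 1.0, 1.0, 1.0]  # hypothetical scorer weights

scores = stage_one_interactiveness(human_feats, object_feats, w)
proposals = select_proposals(scores)
new_h, new_o = stage_two_message_passing(human_feats, object_feats, proposals)
```

In this toy run only the pairs involving the first object pass the threshold, so the second object receives no messages and its features are left unchanged. An interaction (verb) classifier would then operate on the updated node features; that step is omitted here because the record does not describe it.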