Dual relation network for temporal action localization

Bibliographic Details
Published in	Pattern recognition Vol. 129; p. 108725
Main Authors	Xia, Kun; Wang, Le; Zhou, Sanping; Hua, Gang; Tang, Wei
Format	Journal Article
Language	English
Published	Elsevier Ltd 01.09.2022
Subjects	Relation reasoning; Temporal action localization

Summary:	Temporal action localization is a challenging task for video understanding. Most previous methods process each proposal independently and neglect the reasoning of proposal-proposal and proposal-context relations. We argue that the supplementary information obtained by exploiting these relations can enhance the proposal representation and further boost the action localization. To this end, we propose a dual relation network to model both proposal-proposal and proposal-context relations. Concretely, a proposal-proposal relation module is leveraged to learn discriminative supplementary information from relevant proposals, which allows the network to model their interaction based on appearance and geometric similarities. Meanwhile, a proposal-context relation module is employed to mine contextual clues by adaptively learning from the global context outside of region-based proposals. They effectively leverage the inherent correlation between actions and the long-term dependency with videos for high-quality proposal refinement. As a result, the proposed framework enables the model to distinguish similar action instances and locate temporal boundaries more precisely. Extensive experiments on the THUMOS14 dataset and ActivityNet v1.3 dataset demonstrate that the proposed method significantly outperforms recent state-of-the-art methods.
ISSN:	0031-3203 1873-5142
DOI:	10.1016/j.patcog.2022.108725
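
Illustrative sketch:	The summary describes a proposal-proposal relation module that lets each proposal draw on relevant proposals according to appearance and geometric similarities. The record gives no formulas, so the following is only a minimal sketch of one common way such a module could be built, not the authors' implementation. It assumes (all of these are assumptions, not taken from the article) that appearance similarity is a scaled dot product of proposal features, that geometric similarity is the temporal IoU of the proposal segments, and that the two are combined in a softmax attention with a residual update. The names temporal_iou and relate_proposals are hypothetical.

```python
import numpy as np


def temporal_iou(segments):
    """Pairwise temporal IoU for proposals given as rows of (start, end)."""
    start, end = segments[:, 0], segments[:, 1]
    # Overlap length between every pair of segments, clipped at zero.
    inter = np.clip(
        np.minimum(end[:, None], end[None, :])
        - np.maximum(start[:, None], start[None, :]),
        0.0,
        None,
    )
    length = end - start
    union = length[:, None] + length[None, :] - inter
    return inter / np.maximum(union, 1e-8)


def relate_proposals(features, segments, eps=1e-6):
    """Refine proposal features with attention over the other proposals.

    Assumption: attention logits are the sum of an appearance term
    (scaled dot product) and a geometric term (log temporal IoU).
    """
    _, dim = features.shape
    appearance = features @ features.T / np.sqrt(dim)
    geometry = np.log(temporal_iou(segments) + eps)
    logits = appearance + geometry
    # Numerically stable row-wise softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)
    # Residual update: original feature plus aggregated supplementary feature.
    refined = features + weights @ features
    return refined, weights


if __name__ == "__main__":
    # Toy input: four proposals with 8-dimensional features (made-up values).
    feats = np.ones((4, 8))
    segs = np.array([[0.0, 2.0], [1.0, 3.0], [10.0, 12.0], [11.0, 14.0]])
    refined, weights = relate_proposals(feats, segs)
    print(weights.round(3))
```

In this sketch, proposals that do not overlap in time receive a strongly negative geometric term, so they contribute almost nothing to each other's refined features. The proposal-context relation module from the summary is not covered here.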