Together Recognizing, Localizing and Summarizing Actions in Egocentric Videos

Bibliographic Details
Published in: IEEE Transactions on Image Processing, Vol. 30, pp. 4330-4340
Main Authors: Sahu, Abhimanyu; Chowdhury, Ananda S.
Format: Journal Article
Language: English
Published: United States: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2021

Summary: Analysis of egocentric video has recently drawn the attention of researchers in both the computer vision and multimedia communities. In this paper, we propose a weakly supervised, superpixel-level joint framework for localization, recognition and summarization of actions in an egocentric video. We first recognize and localize single as well as multiple actions in each frame of an egocentric video and then construct a summary of these detected actions. The superpixel-level solution enables precise localization of actions in addition to improving recognition accuracy. Superpixels are extracted within the central regions of the egocentric video frames; these central regions are determined using a previously developed center-surround model. A sparse spatio-temporal video representation graph is constructed in the deep feature space with the superpixels as nodes. A weakly supervised solution using random walks yields action labels for each superpixel. After determining the action label(s) for each frame from its constituent superpixels, we apply a fractional knapsack type formulation to obtain a summary of actions. Experimental comparisons on the publicly available ADL, GTEA, EGTEA Gaze+, EgoGesture, and EPIC-Kitchens datasets show the effectiveness of the proposed solution.
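The summarization step is described as a fractional knapsack type formulation. Below is a minimal, hypothetical sketch of that idea in Python, assuming each detected action segment carries an importance score and a duration, and the summary must fit a time budget; the class and function names are illustrative and not the authors' actual implementation.

```python
# Hypothetical sketch: greedy fractional-knapsack selection of detected
# action segments for an egocentric video summary. All names (ActionSegment,
# summarize, budget) are assumptions for illustration only.
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ActionSegment:
    label: str        # recognized action label for the segment
    score: float      # importance/confidence score of the segment
    duration: float   # segment length in seconds


def summarize(segments: List[ActionSegment],
              budget: float) -> List[Tuple[ActionSegment, float]]:
    """Pick segments in decreasing score-to-duration ratio until the time
    budget is used up; the last segment may be kept only fractionally,
    as in the classical fractional knapsack problem."""
    ranked = sorted(segments, key=lambda s: s.score / s.duration, reverse=True)
    summary, remaining = [], budget
    for seg in ranked:
        if remaining <= 0:
            break
        take = min(seg.duration, remaining)          # seconds kept from this segment
        summary.append((seg, take / seg.duration))   # (segment, kept fraction)
        remaining -= take
    return summary


if __name__ == "__main__":
    segs = [
        ActionSegment("open fridge", score=0.9, duration=4.0),
        ActionSegment("pour water", score=0.6, duration=6.0),
        ActionSegment("wash hands", score=0.8, duration=10.0),
    ]
    for seg, frac in summarize(segs, budget=12.0):
        print(f"{seg.label}: keep {frac:.0%} of {seg.duration:.0f}s")
```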
ISSN: 1057-7149
EISSN: 1941-0042
DOI: 10.1109/TIP.2021.3070732