Egocentric Vision-based Action Recognition: A survey

Bibliographic Details
Published in Neurocomputing (Amsterdam) Vol. 472; pp. 175-197
Main Authors Núñez-Marcos, Adrián; Azkune, Gorka; Arganda-Carreras, Ignacio
Format Journal Article
Language English
Published Elsevier B.V. 01.02.2022
Summary:
• A taxonomy to classify egocentric action recognition methods into categories.
• Egocentric action recognition results are still far from acceptable.
• New features, such as sound, are being leveraged in recent works.
• Zero-, one-, and few-shot paradigms to adapt systems to the real world.
• Increasing the size and complexity of the datasets will help advance the research.

The egocentric action recognition (EAR) field has recently grown in popularity thanks to the affordable, lightweight wearable cameras now available, such as the GoPro and similar devices. As a result, the amount of egocentric data generated has increased, triggering interest in the understanding of egocentric videos. More specifically, the recognition of actions in egocentric videos has gained popularity because of the challenge it poses: the wild movement of the camera and the lack of context make it hard to recognise actions with a performance similar to that of third-person vision solutions. This has ignited research interest in the field, and many public datasets and competitions can now be found in both the machine learning and the computer vision communities. In this survey, we aim to analyse the literature on egocentric vision methods and algorithms. To that end, we propose a taxonomy that divides the literature into categories and subcategories, contributing a more fine-grained classification of the available methods. We also review the zero-shot approaches used by the EAR community, a methodology that could help transfer EAR algorithms to real-world applications. Finally, we summarise the datasets used by researchers in the literature.
ISSN: 0925-2312, 1872-8286
DOI: 10.1016/j.neucom.2021.11.081