Egocentric Vision-based Action Recognition: A survey

Bibliographic Details
Published in	Neurocomputing (Amsterdam) Vol. 472; pp. 175 - 197
Main Authors	Núñez-Marcos, Adrián; Azkune, Gorka; Arganda-Carreras, Ignacio
Format	Journal Article
Language	English
Published	Elsevier B.V 01.02.2022
Subjects	Computer vision; Deep learning; Egocentric vision; Few-shot learning; Human action recognition
Summary:	• A taxonomy to classify egocentric action recognition methods into categories.
• Egocentric action recognition results are still far from acceptable.
• New features, such as sound, are being leveraged in recent works.
• Zero-, one-, and few-shot paradigms to adapt systems to the real world.
• Increasing the size and complexity of the datasets will help advance the research.

The egocentric action recognition (EAR) field has recently grown in popularity thanks to the affordable, lightweight wearable cameras now available, such as the GoPro and similar devices. As a result, the amount of egocentric data generated has increased, triggering interest in the understanding of egocentric videos. More specifically, the recognition of actions in egocentric videos has gained popularity because of the challenge it poses: the wild movement of the camera and the lack of context make it hard to recognise actions with a performance similar to that of third-person vision solutions. This has ignited research interest in the field and, nowadays, many public datasets and competitions can be found in both the machine learning and the computer vision communities. In this survey, we aim to analyse the literature on egocentric vision methods and algorithms. To that end, we propose a taxonomy that divides the literature into categories and subcategories, contributing a more fine-grained classification of the available methods. We also review the zero-shot approaches used by the EAR community, a methodology that could help transfer EAR algorithms to real-world applications. Finally, we summarise the datasets used by researchers in the literature.
ISSN:	0925-2312 1872-8286
DOI:	10.1016/j.neucom.2021.11.081