Action Quality Assessment using Transformers

Bibliographic Details
Published in: arXiv.org
Main Authors: Iyer, Abhay; Alali, Mohammad; Bodala, Hemanth; Vaidya, Sunit
Format: Paper
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 20.07.2022
Summary: Action quality assessment (AQA) is an active research problem in video-based applications, and it is challenging because the score varies from frame to frame. Existing methods address this problem with convolution-based approaches, which are limited in their ability to capture long-range dependencies. With the recent advancements in Transformers, we show that they are a suitable alternative to conventional convolution-based architectures. Specifically, we ask whether transformer-based models can solve the AQA task for diving videos by effectively capturing long-range dependencies, parallelizing computation, and providing a wider receptive field. To demonstrate the effectiveness of our proposed architectures, we conducted comprehensive experiments and achieved a competitive Spearman correlation score of 0.9317. Additionally, we explore the effect of hyperparameters on the model's performance and pave a new path for exploiting Transformers in AQA.
ISSN: 2331-8422
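
Note on the reported metric: the summary evaluates the models by Spearman correlation, which measures how well the ranking of predicted scores agrees with the ranking of reference scores. The sketch below is only an illustration of that metric in plain Python, not the authors' evaluation code. The function names and the sample scores are hypothetical and do not come from the paper.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(predicted, target):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    if len(predicted) != len(target) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(predicted), average_ranks(target)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Hypothetical scores for five dives (illustrative only, not data from the paper).
model_scores = [70.1, 85.3, 60.0, 92.4, 78.0]
judge_scores = [72.0, 80.5, 65.0, 95.0, 83.0]
print(round(spearman(model_scores, judge_scores), 4))
```

In this made-up sample the two rankings differ only by one swapped pair, so the correlation comes out close to, but below, 1. Because only ranks enter the computation, any monotonic rescaling of the predicted scores leaves the value unchanged.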