On Mobile Pose Estimation and Action Recognition Design and Implementation


Bibliographic Details
Published in Pattern recognition and image analysis Vol. 34; no. 1; pp. 126-136
Main Author Aslanyan, M.
Format Journal Article
Language English
Published Moscow Pleiades Publishing 01.03.2024
Springer Nature B.V
Summary: Human pose estimation (PE, tracking body pose on the go) is a computer vision-based technology that identifies and tracks specific points on the human body. These points represent the joints and other landmarks of the body, and they determine sizes, distances, angles of flexion, and the type of motion. Knowing these values for a specific exercise is the basis for applications in rehabilitation and physiotherapy, fitness and self-coaching, augmented reality, animation and gaming, robot control, surveillance, and human activity analysis. Such capabilities can be implemented with special suits or sensor arrays to achieve the best results, but mass use of PE depends on devices that many users already own, namely smartphones, smartwatches, and earbuds. A body pose estimation system starts with capturing the initial data. For motion detection, it is necessary to analyze a sequence of images rather than a still photo. Different software modules are responsible for tracking 2D key points, creating a body representation, and converting it into 3D space. Action recognition, on the other hand, analyzes the sequence of estimated pose data in order to categorize the sequence into classes. It is widely used in various fields. One widely known use case is the analysis and detection of potential attacks or illegal actions using video from surveillance cameras. Another use case is the analysis of pose sequences in order to create a virtual coaching environment. Specifically, our research targets this challenging issue and aims to create such an environment for mobile devices. We describe some solutions that are suitable for effective pose estimation and action recognition on mobile devices. We show how lightweight models based on convolutional neural networks can be used to solve the pose estimation problem efficiently, and how the action recognition problem can be addressed with the dynamic time warping algorithm.
ISSN: 1054-6618
1555-6212
DOI: 10.1134/S1054661824010036
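
The summary mentions addressing action recognition with the dynamic time warping (DTW) algorithm applied to sequences of estimated poses. As a rough illustration only, and not the article's implementation, the sketch below compares pose sequences with textbook DTW and assigns the label of the nearest template. The frame representation (a list of 2D key points), the per-frame distance, the function names, and the toy data are all assumptions made for this example.

```python
import math

def frame_distance(frame_a, frame_b):
    """Mean Euclidean distance between corresponding key points of two frames.

    Assumption: each frame is a list of (x, y) key points in the same order.
    """
    return sum(math.dist(p, q) for p, q in zip(frame_a, frame_b)) / len(frame_a)

def dtw_distance(seq_a, seq_b):
    """Classic DTW cost between two pose sequences of possibly different length."""
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = minimal accumulated cost of aligning seq_a[:i] with seq_b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # seq_b frame repeated
                                 cost[i][j - 1],      # seq_a frame repeated
                                 cost[i - 1][j - 1])  # both sequences advance
    return cost[n][m]

def classify(query, templates):
    """Return the label of the template with the smallest DTW cost to the query."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))

# Hypothetical toy data: two key points per frame (hip, knee), y is height.
templates = {
    "squat": [[(0.0, 1.0), (0.0, 0.5)],
              [(0.0, 0.6), (0.0, 0.5)],
              [(0.0, 1.0), (0.0, 0.5)]],
    "stand": [[(0.0, 1.0), (0.0, 0.5)]] * 3,
}

# The same squat performed more slowly (five frames instead of three).
query = [[(0.0, 1.0), (0.0, 0.5)],
         [(0.0, 0.8), (0.0, 0.5)],
         [(0.0, 0.6), (0.0, 0.5)],
         [(0.0, 0.8), (0.0, 0.5)],
         [(0.0, 1.0), (0.0, 0.5)]]

if __name__ == "__main__":
    print(classify(query, templates))
```

DTW is a natural fit here because the same exercise can be performed at different speeds: the alignment lets one frame of a template match several frames of the query. In practice the key points would likely be normalized (for example, by body size and position) before comparison, which this sketch omits.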