Action Recognition Improved by Correlations and Attention of Subjects and Scene

Bibliographic Details
Published in	Visual communications and image processing (Online), pp. 1-5
Main Authors	Ha, Manh-Hung; Chen, Oscal Tzyh-Chiang
Format	Conference Proceeding
Language	English
Published	IEEE 05.12.2021
Subjects	action recognition; convolutional neural network; Correlation; Deep learning; Deep neural network; Image recognition; Law enforcement; Neural networks; spatiotemporal attention; subject detection; Three-dimensional displays; transformer encoder; Visual communication

Summary:	Comprehensive understanding of multi-subject activity in a video requires subject detection, action identification, and behavior interpretation, as well as modeling of the interactions among subjects and the background. This work performs action recognition for one or more subjects based on the correlations and interactions between the whole scene and the subject(s), using a Deep Neural Network (DNN). The proposed DNN consists of a 3D Convolutional Neural Network (CNN), a Spatial Attention (SA) generation layer, a mapping convolutional fused-depth layer, a Transformer Encoder (TE), and two fully connected layers with late fusion for final classification. In particular, the attention mechanisms in the SA layer and the TE extract meaningful action information in the spatial and temporal domains, respectively, to enhance recognition performance. Experimental results show that the proposed DNN achieves accuracies of 97.8%, 98.4%, and 85.6% on the traffic police, UCF101-24, and JHMDB-21 datasets, respectively. The proposed DNN is therefore an effective classifier for action recognition involving one or multiple subjects.
ISSN:	2642-9357
DOI:	10.1109/VCIP53242.2021.9675340
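
Illustrative note: the summary mentions two fully connected layers combined by late fusion for the final classification. The record does not state the fusion rule, so the following is only a minimal sketch under an assumption: each branch (here called "scene" and "subject", names chosen for illustration) outputs per-class logits, and the two are fused by a weighted average of their softmax probabilities. The function names and the `scene_weight` parameter are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(scene_logits, subject_logits, scene_weight=0.5):
    """Fuse two branches by a weighted average of class probabilities.

    Assumption: averaging softmax outputs is one common late-fusion rule;
    the paper may use a different one.
    """
    if len(scene_logits) != len(subject_logits):
        raise ValueError("both branches must score the same set of classes")
    if not 0.0 <= scene_weight <= 1.0:
        raise ValueError("scene_weight must lie in [0, 1]")
    p_scene = softmax(scene_logits)
    p_subject = softmax(subject_logits)
    return [
        scene_weight * a + (1.0 - scene_weight) * b
        for a, b in zip(p_scene, p_subject)
    ]


def predict(scene_logits, subject_logits, scene_weight=0.5):
    """Return the index of the class with the highest fused probability."""
    fused = late_fusion(scene_logits, subject_logits, scene_weight)
    return max(range(len(fused)), key=fused.__getitem__)
```

With equal weights, a branch that is more confident about its top class dominates the fused decision; setting `scene_weight` to 1.0 or 0.0 reduces the sketch to a single-branch classifier.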