Advancements in Human Action Recognition Through 5G/6G Technology for Smart Cities: Fuzzy Integral-Based Fusion

Bibliographic Details
Published in: IEEE Transactions on Consumer Electronics, Vol. 70, no. 3, pp. 5783-5795
Main Authors: Mehmood, Faisal; Chen, Enqing; Azeem Akbar, Muhammad; Azam Zia, Muhammad; Alsanad, Ahmed; Abdullah Alhogail, Areej; Li, Yang
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.08.2024

Summary: 5G/6G technology improves skeleton-based human action recognition (HAR) by delivering ultra-low latency and high data throughput for real-time, accurate security analysis of human actions. Despite its growing popularity, current HAR methods frequently fail to capture the complexities of the skeleton sequence. This study proposes a novel multimodal method that combines a Spatial-Temporal Attention LSTM (STA-LSTM) network with a Convolutional Neural Network (CNN) to extract nuanced features from the skeleton sequence. The STA-LSTM network models inter- and intra-frame relations, while the CNN uncovers geometric correlations within the human skeleton. Crucially, the Choquet fuzzy integral provides a harmonized fusion of the classifiers built on each feature vector, and Kullback-Leibler and Jensen-Shannon divergences further ensure the complementary nature of these feature vectors. Together, the STA-LSTM network and CNN in the proposed multimodal method significantly advance human action recognition. The approach achieved impressive accuracy on the benchmark skeleton datasets NTU-60, NTU-120, HDM05, and UTD-MHAD: cross-subject accuracies of 90.75% and 84.50%, and cross-setting accuracies of 96.7% and 86.70%, on NTU-60 and NTU-120, respectively, while HDM05 and UTD-MHAD reached 93.5% and 97.43%. These results indicate that the model outperforms current techniques and shows strong potential for sentiment analysis platforms that combine textual and visual signals.
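
The record gives no implementation details, but the fusion step the summary describes (a Choquet fuzzy-integral combination of two classifiers' class scores, with Jensen-Shannon divergence used to gauge how complementary the two score distributions are) can be illustrated with a short Python sketch. Everything below is an assumption made for illustration: the Sugeno lambda-fuzzy measure, the per-classifier reliability densities, and names such as choquet_fuse and js_divergence are hypothetical and are not the authors' code.

import numpy as np

def solve_lambda(densities, iters=200):
    """Solve prod_i(1 + lam*g_i) = 1 + lam for lam > -1, lam != 0, by bisection.
    Assumes sum(g_i) != 1 (otherwise lam = 0)."""
    densities = np.asarray(densities, float)
    f = lambda lam: np.prod(1.0 + lam * densities) - (1.0 + lam)
    lo, hi = ((-1.0 + 1e-9, -1e-9) if densities.sum() > 1.0 else (1e-9, 1e6))
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (lo, mid) if f(lo) * f(mid) <= 0.0 else (mid, hi)
    return 0.5 * (lo + hi)

def sugeno_measure(densities, lam):
    """Sugeno lambda fuzzy measure of the subset holding the given densities."""
    densities = np.asarray(densities, float)
    if abs(lam) < 1e-12:
        return float(densities.sum())
    return float((np.prod(1.0 + lam * densities) - 1.0) / lam)

def choquet_fuse(scores, densities):
    """Choquet integral, per class, over n classifier confidences.
    scores:    (n,) confidences of each classifier for one class
    densities: (n,) fuzzy densities g_i, e.g. each classifier's validation accuracy
    """
    scores, densities = np.asarray(scores, float), np.asarray(densities, float)
    lam = solve_lambda(densities)
    order = np.argsort(scores)[::-1]            # sort confidences descending
    h, g = scores[order], densities[order]
    fused = 0.0
    for i in range(len(h)):
        g_Ai = sugeno_measure(g[: i + 1], lam)  # measure of the top-(i+1) classifiers
        h_next = h[i + 1] if i + 1 < len(h) else 0.0
        fused += (h[i] - h_next) * g_Ai         # standard Choquet summation
    return fused

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two class-probability vectors."""
    p, q = np.asarray(p, float) + eps, np.asarray(q, float) + eps
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: float(np.sum(a * np.log(a / b)))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Example: fuse per-class scores from a hypothetical STA-LSTM head and CNN head.
p_lstm = np.array([0.70, 0.20, 0.10])           # softmax over 3 classes
p_cnn  = np.array([0.55, 0.35, 0.10])
densities = np.array([0.90, 0.85])              # assumed per-classifier reliabilities
fused = [choquet_fuse([p_lstm[c], p_cnn[c]], densities) for c in range(3)]
print("fused scores:", fused, " JS divergence:", js_divergence(p_lstm, p_cnn))

In such a sketch the densities would typically be set from each branch's validation accuracy, and the action label would be taken as the arg-max of the fused per-class scores.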
ISSN: 0098-3063; 1558-4127
DOI: 10.1109/TCE.2024.3420936