Advancements in Human Action Recognition Through 5G/6G Technology for Smart Cities: Fuzzy Integral-Based Fusion
| Published in | IEEE Transactions on Consumer Electronics, Vol. 70, No. 3, pp. 5783-5795 |
|---|---|
| Main Authors | |
| Format | Journal Article |
| Language | English |
| Published | New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.08.2024 |
| Subjects | |
Summary: 5G/6G technology improves skeleton-based human action recognition (HAR) by delivering the ultra-low latency and high data throughput needed for real-time, accurate security analysis of human actions. Despite its growing popularity, current HAR methods frequently fail to capture the complexities of the skeleton sequence. This study proposes a novel multimodal method that combines a Spatial-Temporal Attention LSTM (STA-LSTM) network with a Convolutional Neural Network (CNN) to extract nuanced features from the skeleton sequence. The STA-LSTM network models inter- and intra-frame relations, while the CNN uncovers geometric correlations within the human skeleton. By integrating the Choquet fuzzy integral, the method achieves a harmonized fusion of the classifiers built on each feature vector, and Kullback-Leibler and Jensen-Shannon divergences further ensure that the feature vectors are complementary. Evaluated on the benchmark skeletal datasets NTU-60, NTU-120, HDM05, and UTD-MHAD, the approach demonstrated impressive accuracy: cross-subject accuracies of 90.75% and 84.50%, and cross-setting accuracies of 96.70% and 86.70%, on NTU-60 and NTU-120, respectively, with 93.5% on HDM05 and 97.43% on UTD-MHAD. These results indicate that the model outperforms current techniques and has excellent potential for sentiment analysis platforms that combine textual and visual signals.
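The record gives no implementation details, but a minimal sketch of the fusion step the summary names (a discrete Choquet integral over two classifiers' class scores, using a Sugeno λ-fuzzy measure) could look as follows. The fuzzy densities, function names, and the two-classifier restriction are illustrative assumptions, not the paper's actual construction:

```python
import numpy as np

def lambda_root(g1, g2):
    # Sugeno lambda-measure constraint for two sources:
    # (1 + lam*g1) * (1 + lam*g2) = 1 + lam
    # => lam = (1 - g1 - g2) / (g1 * g2)  (lam = 0 when g1 + g2 = 1)
    return (1.0 - g1 - g2) / (g1 * g2)

def choquet_fuse(scores, densities):
    """Fuse per-classifier class scores with the discrete Choquet integral.

    scores    -- (2, n_classes) array of softmax outputs, one row per classifier
                 (e.g., the STA-LSTM stream and the CNN stream)
    densities -- fuzzy densities (g1, g2): the worth of each classifier alone
    """
    g = np.asarray(densities, dtype=float)
    lam = lambda_root(g[0], g[1])
    n_classes = scores.shape[1]
    fused = np.empty(n_classes)
    for k in range(n_classes):
        h = scores[:, k]
        order = np.argsort(-h)  # sort this class's scores descending
        g_prev, total = 0.0, 0.0
        for i in order:
            # grow the coalition measure: g(A ∪ {i}) = g_i + g(A) + lam*g_i*g(A)
            g_cur = g[i] + g_prev + lam * g[i] * g_prev
            total += h[i] * (g_cur - g_prev)
            g_prev = g_cur  # reaches 1.0 once both sources are included
        fused[k] = total
    return fused

# Hypothetical usage: p_lstm and p_cnn are the two streams' softmax outputs.
p_lstm = np.array([0.70, 0.20, 0.10])
p_cnn  = np.array([0.50, 0.40, 0.10])
fused = choquet_fuse(np.vstack([p_lstm, p_cnn]), densities=(0.6, 0.5))
print(fused, fused.argmax())  # fused class scores and predicted label
```

When the densities sum to more than one, λ is negative and the measure is subadditive, damping redundant agreement between the two streams; densities summing to less than one give superadditive, consensus-rewarding behavior.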
ISSN: 0098-3063, 1558-4127
DOI: 10.1109/TCE.2024.3420936
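The summary also invokes Kullback-Leibler and Jensen-Shannon divergences to ensure the two feature streams are complementary. The record does not say where these are computed; a plausible minimal sketch, assuming they compare the two classifiers' class-probability outputs, is:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence KL(p || q) between discrete distributions."""
    p = np.clip(np.asarray(p, dtype=float), eps, None)
    q = np.clip(np.asarray(q, dtype=float), eps, None)
    p, q = p / p.sum(), q / q.sum()  # renormalize after clipping
    return float(np.sum(p * np.log(p / q)))

def js_divergence(p, q):
    """Symmetric, bounded Jensen-Shannon divergence built from KL."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    m = 0.5 * (p + q)
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical check: a larger divergence between the STA-LSTM and CNN
# output distributions suggests the two streams carry complementary information.
p_lstm = np.array([0.70, 0.20, 0.10])
p_cnn  = np.array([0.50, 0.40, 0.10])
print(js_divergence(p_lstm, p_cnn))
```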