An optimization method of human skeleton keyframes selection for action recognition

Bibliographic Details
Published in	Complex & Intelligent Systems Vol. 10; no. 4; pp. 4659-4673
Main Authors	Chen, Hao; Pan, Yuekai; Wang, Chenwu
Format	Journal Article
Language	English
Published	Cham: Springer International Publishing, 01.08.2024; Springer Nature B.V; Springer
Subjects	Accuracy; Activity recognition; Body parts; Complexity; Computational Intelligence; Data Structures and Information Theory; Datasets; Effectiveness; Engineering; Feature extraction; Human action recognition; Inflection point frame; Keyframes selection; Machine learning; Multi-objective optimization; Multiple objective analysis; Optimization; Optimization models; Original Article
Summary:	In action recognition based on the characteristics of human skeleton joint points, the selection of keyframes in the skeleton sequence is a significant issue that directly affects recognition accuracy. To improve the effectiveness of keyframes selection, this paper proposes inflection point frames and, based on them, transforms keyframes selection into a multi-objective optimization problem. First, pose features are extracted from the input skeleton joint point data and used to construct the pose feature vector of each frame in time sequence; then, the inflection point frames in the sequence are determined according to the flow of momentum of each body part. Next, the pose feature vectors are input into the keyframes multi-objective optimization model, which fuses domain information and the number of keyframes; finally, the output keyframes are input to the action classifier. To verify the effectiveness of the method, three public datasets (MSR-Action3D, UTKinect-Action and Florence3D-Action) are chosen for simulation experiments. The results show that the keyframe sequences obtained by this method can significantly improve the accuracy of multiple action classifiers, and the average recognition accuracy on the three datasets reaches 94.6%, 97.6% and 94.2%, respectively. In addition, combining the optimized keyframes with a deep learning classifier on the NTU RGB+D dataset yields accuracies of 83.2% and 93.7%.
ISSN:	2199-4536 2198-6053
DOI:	10.1007/s40747-024-01403-5
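
Illustrative note: the Summary says inflection point frames are determined from the flow of momentum of each body part, but this record does not give the definition. The Python sketch below is therefore only one plausible reading, not the authors' method. It assumes unit joint mass and a unit time step, uses the summed joint displacement of a body part as a momentum proxy, and flags a frame when that proxy reverses direction. The names `part_momentum` and `inflection_frames` and the body-part grouping are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Frame = Sequence[Vec3]  # one 3D position per joint


def part_momentum(prev: Frame, curr: Frame, joints: Sequence[int]) -> Vec3:
    """Momentum proxy for one body part: summed joint displacement.

    Assumption: unit mass per joint and unit time between frames.
    """
    return tuple(
        sum(curr[j][axis] - prev[j][axis] for j in joints) for axis in range(3)
    )


def inflection_frames(
    frames: Sequence[Frame], parts: Dict[str, Sequence[int]]
) -> List[int]:
    """Indices of frames where any body part's momentum proxy reverses."""
    hits = set()
    for joints in parts.values():
        # momenta[k] describes the motion from frame k to frame k + 1.
        momenta = [
            part_momentum(frames[k], frames[k + 1], joints)
            for k in range(len(frames) - 1)
        ]
        for k in range(1, len(momenta)):
            dot = sum(a * b for a, b in zip(momenta[k - 1], momenta[k]))
            if dot < 0:  # motion into frame k opposes motion out of it
                hits.add(k)
    return sorted(hits)


# Toy sequence: joint 0 moves out along x and back; joint 1 stays still.
toy_frames = [
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(2.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
]
toy_parts = {"arm": [0], "leg": [1]}
toy_result = inflection_frames(toy_frames, toy_parts)
```

In this toy input the only reversal is at frame index 2, where joint 0 turns back. In the pipeline the Summary describes, frames found this way would be used alongside the pose feature vectors in the multi-objective keyframe optimization, which is not sketched here.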