DKD–DAD: a novel framework with discriminative kinematic descriptor and deep attention-pooled descriptor for action recognition

Bibliographic Details
Published in: Neural Computing & Applications, Vol. 32, No. 9, pp. 5285-5302
Main Authors: Tong, Ming; Li, Mingyang; Bai, He; Ma, Lei; Zhao, Mengao
Format: Journal Article
Language: English
Published: London: Springer London (Springer Nature B.V.), 01.05.2020
ISSN: 0941-0643, 1433-3058
DOI: 10.1007/s00521-019-04030-1

Summary: To improve action recognition accuracy, a discriminative kinematic descriptor (DKD) and a deep attention-pooled descriptor (DAD) are proposed. Firstly, the optical flow field is transformed into a set of more discriminative kinematic fields, from which two kinematic features are constructed that more accurately depict the dynamic characteristics of the action subject via multi-order divergence and curl fields. Secondly, by introducing both a tight-loose constraint and an anti-confusion constraint, a discriminative fusion method is proposed that yields better within-class compactness and between-class separability while reducing the confusion caused by outliers; on this basis, the discriminative kinematic descriptor is constructed. Thirdly, a prediction-attentional pooling method is proposed that accurately focuses attention on discriminative local regions, from which the deep attention-pooled descriptor is constructed. Finally, a novel framework combining the two descriptors (DKD–DAD) is presented, which comprehensively captures the discriminative dynamic and static information in a video and thereby improves recognition accuracy. Experiments on two challenging datasets verify the effectiveness of the methods.
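
For orientation only, the following is a minimal sketch of the first step the summary describes: deriving divergence and curl fields from a dense optical flow field. It is not the authors' implementation; the use of NumPy's np.gradient and the recursive "multi-order" construction are labeled assumptions, and the actual definitions, fusion constraints, and attention pooling are given in the paper itself.

```python
# Illustrative sketch only (not the authors' method): turn a dense optical
# flow field into divergence and curl fields, the kind of kinematic
# quantities the summary refers to. np.gradient is an assumed stand-in for
# the paper's differentiation scheme; multi_order_fields is a hypothetical
# reading of "multi-order divergence and curl fields".
import numpy as np

def kinematic_fields(flow):
    """flow: H x W x 2 array; channel 0 = horizontal (u), channel 1 = vertical (v)."""
    u, v = flow[..., 0], flow[..., 1]
    du_dy, du_dx = np.gradient(u)      # derivatives along rows (y) and columns (x)
    dv_dy, dv_dx = np.gradient(v)
    divergence = du_dx + dv_dy         # local expansion / contraction of the motion
    curl = dv_dx - du_dy               # local rotation of the motion
    return divergence, curl

def multi_order_fields(flow, order=2):
    """Stack first- and higher-order divergence/curl pairs (hypothetical)."""
    fields = []
    current = flow
    for _ in range(order):
        div, curl = kinematic_fields(current)
        fields.extend([div, curl])
        current = np.stack([div, curl], axis=-1)  # treat the pair as a new 2-channel field
    return fields
```

In practice one would estimate the flow per frame pair with any standard optical flow method and then pool or encode these fields before the fusion and attention-pooling stages described in the summary.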