Hand-crafted and deep convolutional neural network features fusion and selection strategy: An application to intelligent human action recognition
Human action recognition (HAR) has gained much attention in the last few years due to its enormous applications including human activity monitoring, robotics, visual surveillance, to name but a few. Most of the previously proposed HAR systems have focused on using hand-crafted images features. Howev...
Saved in:
Published in | Applied soft computing Vol. 87; p. 105986 |
---|---|
Main Authors | , , , , , |
Format | Journal Article |
Language | English |
Published |
Elsevier B.V
01.02.2020
|
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | Human action recognition (HAR) has gained much attention in the last few years due to its enormous applications including human activity monitoring, robotics, visual surveillance, to name but a few. Most of the previously proposed HAR systems have focused on using hand-crafted images features. However, these features cover limited aspects of the problem and show performance degradation on a large and complex datasets. Therefore, in this work, we propose a novel HAR system which is based on the fusion of conventional hand-crafted features using histogram of oriented gradients (HoG) and deep features. Initially, human silhouette is extracted with the help of saliency-based method - implemented in two phases. In the first phase, motion and geometric features are extracted from the selected channel, whilst, second phase calculates the Chi-square distance between the extracted and threshold-based minimum distance features. Afterwards, extracted deep CNN and hand-crafted features are fused to generate a resultant vector. Moreover, to cope with the curse of dimensionality, an entropy-based feature selection technique is also proposed to identify the most discriminant features for classification using multi-class support vector machine (M-SVM). All the simulations are performed on five publicly available benchmark datasets including Weizmann, UCF11 (YouTube), UCF Sports, IXMAS, and UT-Interaction. A comparative evaluation is also presented to show that our proposed model achieves superior performances in comparison to a few exiting methods.
•Motion and Geometric features are extracted for human flow estimation and silhouette extraction.•Deep CNN and hand crafted features are fused through parallel approach.•Entropy-controlled Chi-square approach is proposed for best features selection.•Experiments are performed on several well-known datasets. |
---|---|
ISSN: | 1568-4946 1872-9681 |
DOI: | 10.1016/j.asoc.2019.105986 |