Hand-crafted and deep convolutional neural network features fusion and selection strategy: An application to intelligent human action recognition

Bibliographic Details
Published in	Applied soft computing Vol. 87; p. 105986
Main Authors	Khan, Muhammad Attique, Sharif, Muhammad, Akram, Tallha, Raza, Mudassar, Saba, Tanzila, Rehman, Amjad
Format	Journal Article
Language	English
Published	Elsevier B.V 01.02.2020
Subjects	Deep CNN features; Feature selection; Features fusion; HAR; Recognition; Shape features; Silhouette extraction

Summary:	Human action recognition (HAR) has gained much attention in the last few years due to its many applications, including human activity monitoring, robotics, and visual surveillance, to name but a few. Most previously proposed HAR systems have focused on hand-crafted image features. However, these features cover limited aspects of the problem and show performance degradation on large and complex datasets. Therefore, in this work, we propose a novel HAR system based on the fusion of conventional hand-crafted features, using histogram of oriented gradients (HoG), and deep features. Initially, the human silhouette is extracted with a saliency-based method implemented in two phases. In the first phase, motion and geometric features are extracted from the selected channel, whilst the second phase calculates the Chi-square distance between the extracted and threshold-based minimum distance features. Afterwards, the extracted deep CNN and hand-crafted features are fused to generate a resultant vector. Moreover, to cope with the curse of dimensionality, an entropy-based feature selection technique is also proposed to identify the most discriminant features for classification using a multi-class support vector machine (M-SVM). All simulations are performed on five publicly available benchmark datasets: Weizmann, UCF11 (YouTube), UCF Sports, IXMAS, and UT-Interaction. A comparative evaluation is also presented to show that the proposed model achieves superior performance in comparison to a few existing methods.
•Motion and geometric features are extracted for human flow estimation and silhouette extraction.
•Deep CNN and hand-crafted features are fused through a parallel approach.
•An entropy-controlled Chi-square approach is proposed for best feature selection.
•Experiments are performed on several well-known datasets.
ISSN:	1568-4946 1872-9681
DOI:	10.1016/j.asoc.2019.105986
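
The summary names three building blocks: a Chi-square distance, fusion of hand-crafted and deep features into one vector, and entropy-based feature selection. The following is a minimal sketch of generic versions of those ideas, not the authors' implementation. All function names are hypothetical, plain concatenation stands in for the paper's parallel fusion, and ranking features by the Shannon entropy of their binned values is an assumed selection criterion, since the record does not give the exact formulation.

```python
import math
from typing import List, Sequence


def chi_square_distance(p: Sequence[float], q: Sequence[float],
                        eps: float = 1e-12) -> float:
    """Symmetric Chi-square distance between two non-negative histograms."""
    return 0.5 * sum((a - b) ** 2 / (a + b + eps) for a, b in zip(p, q))


def fuse_serial(hog: Sequence[float], deep: Sequence[float]) -> List[float]:
    """Fuse two feature vectors by concatenation.

    Assumption: this is a stand-in; the paper describes a parallel fusion
    whose details are not given in this record.
    """
    return list(hog) + list(deep)


def column_entropy(values: Sequence[float], bins: int = 8) -> float:
    """Shannon entropy (in bits) of one feature after equal-width binning."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant feature carries no information
    counts = [0] * bins
    for v in values:
        # Map v into [0, bins - 1]; the maximum value falls in the last bin.
        idx = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)


def select_by_entropy(samples: Sequence[Sequence[float]], k: int,
                      bins: int = 8) -> List[int]:
    """Return the indices of the k highest-entropy features, in index order.

    Assumption: "highest entropy first" is an illustrative criterion only.
    """
    n_features = len(samples[0])
    scores = [column_entropy([row[j] for row in samples], bins)
              for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    # Toy data: two samples, each with a 2-D "HoG" part and a 2-D "deep" part.
    fused = [fuse_serial([0.1, 0.9], [3.0, 7.0]),
             fuse_serial([0.8, 0.9], [3.0, 2.0])]
    keep = select_by_entropy(fused, k=2)
    reduced = [[row[j] for j in keep] for row in fused]
    print(keep, reduced)
```

The reduced vectors would then be passed to a multi-class classifier such as an SVM; that step is omitted here to keep the sketch dependency-free.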