Depth-based human action recognition using histogram of templates

Bibliographic Details
Published in Multimedia Tools and Applications, Vol. 83, no. 14, pp. 40415-40449
Main Authors Younsi, Merzouk; Yesli, Samir; Diaf, Moussa
Format Journal Article
Language English
Published New York Springer US 01.04.2024
Springer Nature B.V
Summary: In this paper, we propose an efficient, fast, and easy-to-implement method for recognizing human actions in depth image sequences. In this method, the human body silhouettes are first extracted from the depth image sequences using the Gaussian mixture background subtraction model. After noise is removed from the foreground image by a cascade of morphological operations and area filtering, the contour of the human silhouette is extracted by applying Moore's neighbor contour tracing algorithm. From this contour, features describing the human posture are calculated using the Histogram of Templates (HoT) descriptor. These features are then used to train a dendrogram-based support vector machine that generates the frame-by-frame posture variation signal of the action sequence. The histogram of this signal is then computed and passed as the input vector to a Fuzzy k-Nearest Neighbor (FkNN) classifier that recognizes the human action. The proposed method is evaluated on two publicly available datasets containing various daily actions (Bending, Sitting, Lying, etc.) performed by different human subjects. Extensive experiments are conducted using several values of the number of nearest neighbors (k) in the FkNN and different similarity measures, namely Euclidean distance, Bhattacharyya distance, Kullback–Leibler distance, and histogram intersection-based distance. The results show that the proposed method performs better than, or comparably to, other state-of-the-art approaches. Moreover, the method can process 18 frames per second from the image sequence, which makes it well suited for applications needing real-time human action recognition.
ISSN: 1573-7721, 1380-7501
DOI: 10.1007/s11042-023-16989-0
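
Illustrative sketch (not from the paper): the final stage described in the summary compares a histogram of per-frame posture labels against training histograms using one of four measures and classifies it with a fuzzy k-nearest-neighbor rule. The Python below is a minimal sketch of that idea under stated assumptions. The summary does not give the authors' exact formulas, so the distance definitions, the inverse-distance membership weighting with fuzzifier m, and all function names (normalize, fuzzy_knn, and so on) are assumptions chosen for illustration, based on a common formulation of fuzzy k-NN with crisp training labels.

```python
import math

EPS = 1e-12  # guards against log(0) and division by zero


def normalize(hist):
    """Scale a histogram of posture-label counts so that it sums to 1."""
    total = float(sum(hist))
    return [x / total for x in hist]


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def bhattacharyya(p, q):
    # -ln of the Bhattacharyya coefficient; clamped at 0 against rounding.
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return max(0.0, -math.log(max(bc, EPS)))


def kullback_leibler(p, q):
    # Asymmetric: divergence of p (query) from q (training sample).
    return sum(a * math.log((a + EPS) / (b + EPS)) for a, b in zip(p, q) if a > 0)


def intersection_distance(p, q):
    # One minus the histogram intersection of two normalized histograms.
    return 1.0 - sum(min(a, b) for a, b in zip(p, q))


def fuzzy_knn(query, train, labels, k=3, m=2.0, dist=euclidean):
    """Return class memberships for `query` (assumed formulation).

    Each of the k nearest training histograms votes for its label with
    weight 1 / d^(2 / (m - 1)); the votes are normalized to sum to 1.
    """
    nearest = sorted((dist(query, x), y) for x, y in zip(train, labels))[:k]
    votes = {}
    for d, label in nearest:
        weight = 1.0 / (max(d, EPS) ** (2.0 / (m - 1.0)))
        votes[label] = votes.get(label, 0.0) + weight
    total = sum(votes.values())
    return {label: w / total for label, w in votes.items()}


if __name__ == "__main__":
    # Toy data: each histogram counts how often three hypothetical
    # posture classes occur across the frames of one sequence.
    train = [normalize(h) for h in ([8, 1, 1], [7, 2, 1], [1, 8, 1], [1, 7, 2])]
    labels = ["Bending", "Bending", "Sitting", "Sitting"]
    query = normalize([6, 3, 1])
    for dist in (euclidean, bhattacharyya, kullback_leibler, intersection_distance):
        print(dist.__name__, fuzzy_knn(query, train, labels, k=3, dist=dist))
```

Passing the distance as a parameter mirrors the experimental setup in the summary, where k and the similarity measure are varied independently. The predicted action is the label with the largest membership value.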