Video Based Action Recognition Using Spatial and Temporal Feature

Bibliographic Details
Published in: 2018 IEEE International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData), pp. 635 - 638
Main Authors: Dai, Cheng; Liu, Xingang; Zhong, Luhao; Yu, Tao
Format: Conference Proceeding
Language: English
Published: IEEE, 01.07.2018
Summary: The recognition of actions from video sequences has many applications, such as monitoring, assisted living, surveillance, and smart homes. Despite advances in deep learning methods, the methodologies for processing video data remain an active research topic, because temporal information extraction is still a challenge. In this work, we propose a double-stream human action recognition architecture that combines a spatial feature stream and a temporal feature stream, providing both spatial and temporal features for video-based action recognition. For the spatial stream, individual video frames are extracted as the input, while for the temporal stream, optical flow images are extracted and fed to the deep learning network for temporal feature learning. In the experiments, we evaluated our proposal on the KTH database and achieved superior results compared with traditional methods. To further improve recognition accuracy, we applied a fine-tuning mechanism to optimize the deep learning network parameters. Furthermore, we introduced a linear SVM in place of the softmax classifier to classify the combined features.
DOI: 10.1109/Cybermatics_2018.2018.00129
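The summary describes a two-stream design: a spatial stream over individual RGB frames and a temporal stream over stacked optical flow images, whose features are combined and classified either by a softmax head or by a linear SVM. The following is a minimal sketch, in PyTorch (a framework assumption; the paper does not name one), of how such an architecture can be wired up. The layer sizes, the 10-channel stacked-flow input, and the names TwoStreamNet and make_stream are illustrative assumptions, not the authors' actual network.

```python
# Minimal two-stream sketch (not the authors' exact network).
# Assumptions: 3-channel RGB frames for the spatial stream and 10 stacked
# optical-flow channels (5 frame pairs, x/y components) for the temporal
# stream; all layer sizes are illustrative.
import torch
import torch.nn as nn


def make_stream(in_channels: int) -> nn.Sequential:
    """Small CNN backbone; both streams share this structure."""
    return nn.Sequential(
        nn.Conv2d(in_channels, 32, kernel_size=3, padding=1), nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
        nn.MaxPool2d(2),
        nn.AdaptiveAvgPool2d(1),  # -> (N, 64, 1, 1)
        nn.Flatten(),             # -> (N, 64)
    )


class TwoStreamNet(nn.Module):
    def __init__(self, num_classes: int = 6, flow_channels: int = 10):
        super().__init__()
        self.spatial = make_stream(3)               # RGB frame stream
        self.temporal = make_stream(flow_channels)  # stacked optical-flow stream
        self.classifier = nn.Linear(64 + 64, num_classes)  # softmax head

    def features(self, rgb: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
        # Concatenate the two stream features; this fused vector could also
        # be exported and classified by a linear SVM instead of softmax.
        return torch.cat([self.spatial(rgb), self.temporal(flow)], dim=1)

    def forward(self, rgb: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(rgb, flow))


# Toy usage: KTH has 6 action classes; the input resolution is an assumption.
model = TwoStreamNet(num_classes=6)
rgb = torch.randn(4, 3, 120, 160)     # batch of RGB frames
flow = torch.randn(4, 10, 120, 160)   # batch of stacked flow images
logits = model(rgb, flow)             # (4, 6) class scores
fused = model.features(rgb, flow)     # (4, 128) features for an external SVM
print(logits.shape, fused.shape)
```

In the same spirit as the SVM step in the summary, the fused feature vectors could be collected over the training set and passed to an off-the-shelf linear SVM (for example, scikit-learn's LinearSVC) in place of the softmax head; whether that matches the authors' exact pipeline is not specified here.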