Alternative Semantic Representations for Zero-Shot Human Action Recognition

Bibliographic Details
Published in: Machine Learning and Knowledge Discovery in Databases, Vol. 10534, pp. 87-102
Main Authors: Wang, Qian; Chen, Ke
Format: Book Chapter
Language: English
Published: Switzerland: Springer International Publishing AG, 2017
Series: Lecture Notes in Computer Science

Summary: A proper semantic representation for encoding side information is key to the success of zero-shot learning. In this paper, we explore two alternative semantic representations specifically for zero-shot human action recognition: textual descriptions of human actions and deep features extracted from still images relevant to human actions. Such side information is accessible on the Web at little cost, which opens a new way of obtaining side information for large-scale zero-shot human action recognition. We investigate different encoding methods to generate semantic representations for human actions from such side information. Based on our zero-shot visual recognition method, we conducted experiments on UCF101 and HMDB51 to evaluate the two proposed semantic representations. The results suggest that our proposed text- and image-based semantic representations considerably outperform traditional attributes and word vectors for zero-shot human action recognition. In particular, the image-based semantic representations yield favourable performance even though they are extracted from only a small number of images per class.
Code related to this chapter is available at: http://staff.cs.manchester.ac.uk/~kechen/BiDiLEL/
Data related to this chapter are available at: http://staff.cs.manchester.ac.uk/~kechen/ASRHAR/
ISBN: 3319712489, 9783319712482
ISSN: 0302-9743, 1611-3349
DOI: 10.1007/978-3-319-71249-9_6
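
The summary describes encoding side information (textual descriptions of actions, or deep features from a few still images per class) into class-level semantic representations and using them to recognise unseen action classes. The sketch below illustrates that general idea only: class prototypes obtained by averaging pre-extracted features, and nearest-prototype assignment under cosine similarity. It is not the authors' BiDiLEL implementation (see the code URL above); the function names, the 512-dimensional features, and the random stand-in data are assumptions for illustration.

    # Minimal sketch: class-level semantic representations from side information
    # and nearest-prototype zero-shot assignment. Feature extractors (word
    # embeddings of descriptions, CNN features of still images, video embeddings)
    # are assumed to exist elsewhere; random arrays stand in for them here.
    import numpy as np

    def l2_normalize(x, axis=-1, eps=1e-12):
        return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

    def class_prototype(feature_list):
        """Average per-item feature vectors (e.g. word embeddings of a textual
        action description, or CNN features of a few still images) into one
        semantic representation for the class."""
        return l2_normalize(np.mean(np.stack(feature_list), axis=0))

    def predict(video_embedding, class_prototypes):
        """Assign the unseen class whose prototype is most cosine-similar to the
        (already projected) video embedding."""
        names = list(class_prototypes)
        v = l2_normalize(video_embedding)
        sims = np.array([float(v @ class_prototypes[n]) for n in names])
        return names[int(np.argmax(sims))]

    # Hypothetical example: 3 unseen action classes, 5 images per class, 512-d features.
    rng = np.random.default_rng(0)
    prototypes = {
        name: class_prototype([rng.normal(size=512) for _ in range(5)])
        for name in ["archery", "surfing", "juggling"]
    }
    video_embedding = rng.normal(size=512)  # embedding of a test video in the same space
    print(predict(video_embedding, prototypes))

In the chapter's actual pipeline, the mapping from video features into a space shared with these semantic representations is learned (BiDiLEL); the averaging-plus-cosine step above is only a simplified stand-in for that matching stage.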