Structured Images for RGB-D Action Recognition

Bibliographic Details
Published in	2017 IEEE International Conference on Computer Vision Workshops (ICCVW), pp. 1005-1014
Main Authors	Pichao Wang, Shuang Wang, Zhimin Gao, Yonghong Hou, Wanqing Li
Format	Conference Proceeding
Language	English
Published	IEEE 01.10.2017
Subjects	Aggregates; Benchmark testing; Dynamics; Image recognition; Periodic structures; Skeleton; Three-dimensional displays
Abstract	This paper presents a simple yet effective video representation for RGB-D based action recognition. It represents a depth map sequence as three pairs of structured dynamic images, at the body, part and joint levels respectively, constructed through bidirectional rank pooling. Unlike previous works that applied one Convolutional Neural Network (ConvNet) to each part or joint separately, the method constructs one pair of structured dynamic images from the depth maps at each granularity level and uses it as the input to a ConvNet. The structured dynamic image not only preserves spatial-temporal information but also enhances structure information across both body parts/joints and different temporal scales. In addition, it is cheap to construct in both computation and memory. This new representation, referred to as Spatially Structured Dynamic Depth Images (S²DDI), aggregates motion and structure information in a depth sequence from global to fine-grained levels, and makes it possible to fine-tune existing ConvNet models trained on image data for the classification of depth sequences, without training the models afresh. The representation is evaluated on five benchmark datasets (MSRAction3D, G3D, MSRDailyActivity3D, SYSU 3D HOI and UTD-MHAD) and achieves state-of-the-art results on all five.
Author details	1. Pichao Wang (pw212@uowmail.edu.au), Adv. Multimedia Res. Lab., Univ. of Wollongong, Wollongong, NSW, Australia; 2. Shuang Wang (wangshuang1993@tju.edu.cn), Sch. of Electron. Inf. Eng., Tianjin Univ., Tianjin, China; 3. Zhimin Gao (zg126@uowmail.edu.au), Adv. Multimedia Res. Lab., Univ. of Wollongong, Wollongong, NSW, Australia; 4. Yonghong Hou (houroy@tju.edu.cn), Sch. of Electron. Inf. Eng., Tianjin Univ., Tianjin, China; 5. Wanqing Li (wanqing@uow.edu.au), Adv. Multimedia Res. Lab., Univ. of Wollongong, Wollongong, NSW, Australia
DOI	10.1109/ICCVW.2017.123
Discipline	Applied Sciences
EISBN	1538610345 9781538610343
EISSN	2473-9944
URI	https://ieeexplore.ieee.org/document/8265332
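
The abstract describes collapsing a depth map sequence into a pair of dynamic images through bidirectional rank pooling. The record does not say how the pooling is computed, so the following is only a minimal sketch under stated assumptions: it swaps in the closed-form linear weighting known as approximate rank pooling (weights alpha_t = 2t - T - 1) instead of solving a ranking problem, it works on whole frames only (the part-level and joint-level structuring is not shown), and the function names `approx_rank_pool`, `to_uint8` and `bidirectional_dynamic_images` are hypothetical.

```python
import numpy as np


def approx_rank_pool(frames):
    """Collapse a (T, H, W) sequence into one (H, W) image.

    Assumption: uses the linear weights alpha_t = 2t - T - 1 (t = 1..T)
    of approximate rank pooling, not the pooling solver of the paper.
    """
    frames = np.asarray(frames, dtype=np.float64)
    num_frames = frames.shape[0]
    alpha = 2.0 * np.arange(1, num_frames + 1) - num_frames - 1
    # Weighted sum over the time axis: later frames get larger weights.
    return np.tensordot(alpha, frames, axes=(0, 0))


def to_uint8(image):
    """Min-max scale an image to 0..255 so an image ConvNet can consume it."""
    lo, hi = image.min(), image.max()
    if hi == lo:
        return np.zeros(image.shape, dtype=np.uint8)
    return np.round(255.0 * (image - lo) / (hi - lo)).astype(np.uint8)


def bidirectional_dynamic_images(frames):
    """Return the (forward, backward) pair of dynamic images for one sequence."""
    frames = np.asarray(frames, dtype=np.float64)
    forward = approx_rank_pool(frames)         # frames in temporal order
    backward = approx_rank_pool(frames[::-1])  # frames in reversed order
    return to_uint8(forward), to_uint8(backward)


if __name__ == "__main__":
    # Synthetic stand-in for depth maps: a bright block moving to the right.
    sequence = np.zeros((6, 8, 8))
    for t in range(6):
        sequence[t, 2:5, t:t + 3] = 1.0
    fwd, bwd = bidirectional_dynamic_images(sequence)
    print(fwd.shape, bwd.shape)
```

Under this linear approximation the backward image is exactly the negation of the forward one before normalization, and a static sequence pools to an all-zero image because the weights sum to zero. Whether either property holds for the pooling used in the paper depends on details not given in this record.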