Structured Images for RGB-D Action Recognition

Bibliographic Details
Published in 2017 IEEE International Conference on Computer Vision Workshops (ICCVW), pp. 1005-1014
Main Authors Pichao Wang, Shuang Wang, Zhimin Gao, Yonghong Hou, Wanqing Li
Format Conference Proceeding
Language English
Published IEEE 01.10.2017
Abstract This paper presents a simple yet effective video representation for RGB-D based action recognition. It represents a depth map sequence as three pairs of structured dynamic images, at the body, part and joint levels respectively, constructed through bidirectional rank pooling. Unlike previous works that applied one Convolutional Neural Network (ConvNet) to each part/joint separately, one pair of structured dynamic images is constructed from depth maps at each granularity level and serves as the input to a ConvNet. The structured dynamic image not only preserves the spatial-temporal information but also enhances the structure information across both body parts/joints and different temporal scales. In addition, it is cheap to construct in both computation and memory. This new representation, referred to as Spatially Structured Dynamic Depth Images (S²DDI), aggregates motion and structure information in a depth sequence from global to fine-grained levels, and makes it possible to fine-tune existing ConvNet models trained on image data for classification of depth sequences, without training the models afresh. The proposed representation is evaluated on five benchmark datasets, namely MSRAction3D, G3D, MSRDailyActivity3D, SYSU 3D HOI and UTD-MHAD, and achieves state-of-the-art results on all five.
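The abstract names bidirectional rank pooling but this record gives no implementation. As a rough illustration only, the sketch below collapses a depth sequence into one forward and one backward dynamic image using the closed-form approximate rank pooling weights known from the dynamic image literature, which is a stand-in and not necessarily the rank pooling solver the authors used. The function names (`arp_coefficients`, `dynamic_image`, `bidirectional_dynamic_images`) and the min-max scaling to 8 bits are assumptions made for this example, and the body/part/joint structuring described in the abstract is not reproduced.

```python
import numpy as np


def arp_coefficients(T):
    """Approximate rank pooling weights for a sequence of T frames.

    Uses alpha_t = 2(T - t + 1) - (T + 1)(H_T - H_{t-1}), where H_k is the
    k-th harmonic number and H_0 = 0. This closed form is an assumption
    standing in for a full rank pooling optimisation.
    """
    # H[k] holds the k-th harmonic number for k = 0..T.
    H = np.concatenate(([0.0], np.cumsum(1.0 / np.arange(1, T + 1))))
    t = np.arange(1, T + 1)
    return 2.0 * (T - t + 1) - (T + 1) * (H[T] - H[t - 1])


def dynamic_image(frames):
    """Collapse frames of shape (T, H, W) into a single (H, W) uint8 image."""
    frames = np.asarray(frames, dtype=np.float64)
    alpha = arp_coefficients(frames.shape[0])
    # Weighted sum over the time axis.
    d = np.tensordot(alpha, frames, axes=(0, 0))
    lo, hi = d.min(), d.max()
    if hi == lo:
        # Constant result (e.g. a static sequence): return a blank image.
        return np.zeros(d.shape, dtype=np.uint8)
    # Hypothetical normalisation choice: min-max scale to [0, 255].
    return np.round(255.0 * (d - lo) / (hi - lo)).astype(np.uint8)


def bidirectional_dynamic_images(frames):
    """Return the (forward, backward) pair of dynamic images for a sequence."""
    frames = np.asarray(frames)
    return dynamic_image(frames), dynamic_image(frames[::-1])
```

A pair produced this way, for example `fwd, bwd = bidirectional_dynamic_images(depth_maps)` with `depth_maps` of shape `(T, H, W)`, consists of ordinary 2D images, which is what allows ConvNets pre-trained on image data to be fine-tuned on them.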
Author Affiliations
Pichao Wang, pw212@uowmail.edu.au, Adv. Multimedia Res. Lab., Univ. of Wollongong, Wollongong, NSW, Australia
Shuang Wang, wangshuang1993@tju.edu.cn, Sch. of Electron. Inf. Eng., Tianjin Univ., Tianjin, China
Zhimin Gao, zg126@uowmail.edu.au, Adv. Multimedia Res. Lab., Univ. of Wollongong, Wollongong, NSW, Australia
Yonghong Hou, houroy@tju.edu.cn, Sch. of Electron. Inf. Eng., Tianjin Univ., Tianjin, China
Wanqing Li, wanqing@uow.edu.au, Adv. Multimedia Res. Lab., Univ. of Wollongong, Wollongong, NSW, Australia
DOI 10.1109/ICCVW.2017.123
Discipline Applied Sciences
EISBN 1538610345
9781538610343
EISSN 2473-9944
SubjectTerms Aggregates
Benchmark testing
Dynamics
Image recognition
Periodic structures
Skeleton
Three-dimensional displays
URI https://ieeexplore.ieee.org/document/8265332