Structured Images for RGB-D Action Recognition
Published in | 2017 IEEE International Conference on Computer Vision Workshops (ICCVW) pp. 1005 - 1014 |
---|---|
Main Authors | Pichao Wang, Shuang Wang, Zhimin Gao, Yonghong Hou, Wanqing Li |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.10.2017 |
Subjects | |
Online Access | Get full text |
Abstract | This paper presents a simple yet effective video representation for RGB-D based action recognition. It represents a depth map sequence as three pairs of structured dynamic images, at the body, part and joint levels respectively, constructed through bidirectional rank pooling. Unlike previous works that applied one Convolutional Neural Network (ConvNet) to each part/joint separately, one pair of structured dynamic images is constructed from depth maps at each granularity level and serves as the input to a ConvNet. The structured dynamic image not only preserves the spatial-temporal information but also enhances the structure information across both body parts/joints and different temporal scales. In addition, it requires little computation and memory to construct. This new representation, referred to as Spatially Structured Dynamic Depth Images (S²DDI), aggregates motion and structure information in a depth sequence from global to fine-grained levels, and makes it possible to fine-tune existing ConvNet models trained on image data for classification of depth sequences, without training the models from scratch. The proposed representation is evaluated on five benchmark datasets, namely MSRAction3D, G3D, MSRDailyActivity3D, SYSU 3D HOI and UTD-MHAD, and achieves state-of-the-art results on all five. |
---|---|
Author | Zhimin Gao Shuang Wang Yonghong Hou Pichao Wang Wanqing Li |
Author_xml | – sequence: 1 surname: Pichao Wang fullname: Pichao Wang email: pw212@uowmail.edu.au organization: Adv. Multimedia Res. Lab., Univ. of Wollongong, Wollongong, NSW, Australia – sequence: 2 surname: Shuang Wang fullname: Shuang Wang email: wangshuang1993@tju.edu.cn organization: Sch. of Electron. Inf. Eng., Tianjin Univ., Tianjin, China – sequence: 3 surname: Zhimin Gao fullname: Zhimin Gao email: zg126@uowmail.edu.au organization: Adv. Multimedia Res. Lab., Univ. of Wollongong, Wollongong, NSW, Australia – sequence: 4 surname: Yonghong Hou fullname: Yonghong Hou email: houroy@tju.edu.cn organization: Sch. of Electron. Inf. Eng., Tianjin Univ., Tianjin, China – sequence: 5 surname: Wanqing Li fullname: Wanqing Li email: wanqing@uow.edu.au organization: Adv. Multimedia Res. Lab., Univ. of Wollongong, Wollongong, NSW, Australia |
BookMark | eNotjrtOwzAUQA0qEm3pysKSH3Dw9b1-jSVAiVSpUnmNlWM7VRBNUJIO_D1FMJ0zHZ0Zm7Rdmxi7BpEDCHdbFsXbey4FmBwknrEZKLQaBJI6Z1NJBrlzRJdsMQwfQgjQoByKKcufx_4YxmOfYlYe_D4NWd312XZ1x--zZRibrs22KXT7tvn1K3ZR-88hLf45Z6-PDy_FE19vVmWxXPMGjBq5IoqxkjqCV9FFX1WOkjWWHJkKMWAKlTOnBQNEqIV2dYj2dB7B1koFnLObv26TUtp99c3B9987K7VClPgDolFC3Q |
ContentType | Conference Proceeding |
DBID | 6IE 6IL CBEJK RIE RIL |
DOI | 10.1109/ICCVW.2017.123 |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume IEEE Xplore All Conference Proceedings IEEE Electronic Library Online IEEE Proceedings Order Plans (POP All) 1998-Present |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library Online url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Applied Sciences |
EISBN | 1538610345 9781538610343 |
EISSN | 2473-9944 |
EndPage | 1014 |
ExternalDocumentID | 8265332 |
Genre | orig-research |
GroupedDBID | 6IE 6IF 6IH 6IK 6IL 6IM 6IN AAJGR ABLEC ADZIZ ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK IPLJI OCL RIE RIL RNS |
ID | FETCH-LOGICAL-i175t-544ddb26d1a5d9dabb94e8784947b33c3ecb97159714436069fcd8123d18f55c3 |
IEDL.DBID | RIE |
IngestDate | Wed Jun 26 19:27:53 EDT 2024 |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-i175t-544ddb26d1a5d9dabb94e8784947b33c3ecb97159714436069fcd8123d18f55c3 |
PageCount | 10 |
ParticipantIDs | ieee_primary_8265332 |
PublicationCentury | 2000 |
PublicationDate | 2017-Oct. |
PublicationDateYYYYMMDD | 2017-10-01 |
PublicationDate_xml | – month: 10 year: 2017 text: 2017-Oct. |
PublicationDecade | 2010 |
PublicationTitle | 2017 IEEE International Conference on Computer Vision Workshops (ICCVW) |
PublicationTitleAbbrev | ICCVW |
PublicationYear | 2017 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
SSID | ssj0001615930 |
Score | 1.9725579 |
Snippet | This paper presents an effective yet simple video representation for RGB-D based action recognition. It proposes to represent a depth map sequence into three... |
SourceID | ieee |
SourceType | Publisher |
StartPage | 1005 |
SubjectTerms | Aggregates Benchmark testing Dynamics Image recognition Periodic structures Skeleton Three-dimensional displays |
Title | Structured Images for RGB-D Action Recognition |
URI | https://ieeexplore.ieee.org/document/8265332 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
link | http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwjV09T8MwELVKJ6YCLeJbHhhJ6sR2HI9QKC1SESoUulXxRySESBBtF349ZydtEWJgi7wkZ8t-75z37hA6B1KuhFR5ELFYuAQlCQB3YF-RTGiSaJV7f8XoPhlM2N2UTxvoYu2FsdZ68ZkN3aP_l29KvXRXZV2gwsBO4MDdSklcebU29ykAzZKSui5jRGR32Os9vzjxlggj143oR_cUDx79FhqtXltpRt7C5UKF-utXRcb_ftcO6mxsevhhDUC7qGGLPdSqeSWud-28jcJHXyR2-Qmjw3c4QOYYqCoe314F1_jSGxvweCUkKosOmvRvnnqDoO6TELwC-C8CzpgxKk5MlHEjTaaUZDYVKZNMKEo1tVpJAdMjIHuikLHIXBsAdmqiNOdc033ULMrCHiAslGYmo8AaMg6LB7kfzyKam9QSkZNYHaK2C3_2UZXCmNWRH_09fIy23fRX2rcT1IRg7Slg-EKd-cX7BvJembs |
link.rule.ids | 310,311,786,790,795,796,802,23958,23959,25170,27956,55107 |
linkProvider | IEEE |
linkToHtml | http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwjV1NT8JAEJ0QPOgJFYzf7sGjLS277XaPiiIoEIOg3Ej3owkxFiNw8dc73RYwxoO3Zi_d2cnue7P7ZgbgEkm55EImjs8aPAtQQgdxB_eVF3PlhUomNr-i1w_bI_YwDsYluFrnwhhjrPjMuNmnfcvXM7XMrsrqSIWRneCBu4U47_E8W2tzo4LgLKhXVGb0PVHvNJsvr5l8i7t-1o_oR_8UCx-tCvRWP85VI2_uciFd9fWrJuN_Z7YLtU2iHnlaQ9AelEy6D5WCWZJi386r4D7bMrHLTxztvOMRMidIVsng_sa5Jdc2tYEMVlKiWVqDUetu2Gw7RacEZ4rwv3ACxrSWjVD7caCFjqUUzEQ8YoJxSamiRknBcXk4xk8UYxaRKI3QTrUfJUGg6AGU01lqDoFwqZiOKfKGOED3YfQXxD5NdGQ8nngNeQTVzPzJR14MY1JYfvz38AVst4e97qTb6T-ewE7milwJdwplNNycIaIv5Ll15DdAop0P |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=proceeding&rft.title=2017+IEEE+International+Conference+on+Computer+Vision+Workshops+%28ICCVW%29&rft.atitle=Structured+Images+for+RGB-D+Action+Recognition&rft.au=Pichao+Wang&rft.au=Shuang+Wang&rft.au=Zhimin+Gao&rft.au=Yonghong+Hou&rft.date=2017-10-01&rft.pub=IEEE&rft.eissn=2473-9944&rft.spage=1005&rft.epage=1014&rft_id=info:doi/10.1109%2FICCVW.2017.123&rft.externalDocID=8265332 |
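
The abstract describes collapsing a depth map sequence into a forward and a backward dynamic image through bidirectional rank pooling. The record gives no implementation details, so the following is only a minimal sketch under stated assumptions: it replaces the rank pooling optimization with a simple linear weighting of frames (weight 2t - T - 1 for frame t of T), works on whole frames only (no part or joint level structuring), and uses hypothetical function names. It should not be read as the authors' method.

```python
from typing import List, Tuple

Frame = List[List[float]]  # one depth map, stored as rows of pixel values


def weighted_sum(frames: List[Frame]) -> Frame:
    """Collapse a sequence into one image using weights 2t - T - 1.

    Assumption: this linear weighting stands in for rank pooling;
    it is not the optimization described in the paper.
    """
    T = len(frames)
    if T == 0:
        raise ValueError("need at least one frame")
    rows, cols = len(frames[0]), len(frames[0][0])
    out = [[0.0] * cols for _ in range(rows)]
    for t, frame in enumerate(frames, start=1):
        alpha = 2 * t - T - 1  # later frames receive larger weights
        for r in range(rows):
            for c in range(cols):
                out[r][c] += alpha * frame[r][c]
    return out


def to_uint8(image: Frame) -> List[List[int]]:
    """Min-max rescale to 0..255 so the result resembles an ordinary image."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0 for _ in row] for row in image]
    scale = 255.0 / (hi - lo)
    return [[int(round((v - lo) * scale)) for v in row] for row in image]


def bidirectional_dynamic_images(
    frames: List[Frame],
) -> Tuple[List[List[int]], List[List[int]]]:
    """Return (forward, backward) images from the sequence and its reverse."""
    forward = to_uint8(weighted_sum(frames))
    backward = to_uint8(weighted_sum(frames[::-1]))
    return forward, backward


if __name__ == "__main__":
    # Toy 1x2 "depth maps": the second pixel grows over time.
    seq = [[[0.0, 0.0]], [[0.0, 1.0]], [[0.0, 2.0]]]
    fwd, bwd = bidirectional_dynamic_images(seq)
    print(fwd, bwd)
```

In this toy sequence the pixel that increases over time comes out bright in the forward image and dark in the backward one, which illustrates why a pair of images can carry the direction of motion. Each image of such a pair could then be passed to a ConvNet pretrained on ordinary images, as the abstract suggests for fine-tuning.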