SpatioTemporal focus for skeleton-based action recognition

Bibliographic Details
Published in: Pattern Recognition, Vol. 136, p. 109231
Main Authors: Wu, Liyu; Zhang, Can; Zou, Yuexian
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.04.2023
Subjects
Online Access: Get full text

Abstract
• The multi-scale representations of the skeleton improve recognition performance.
• Learning different temporal dynamics according to different action instances.
• A model based on GCN and attention captures the topology information of the skeleton.
• A complementary representation of video with respect to the RGB modality.

Graph convolutional networks (GCNs) are widely adopted in skeleton-based action recognition due to their powerful ability to model data topology. We argue that the performance of recently proposed skeleton-based action recognition methods is limited by the following factors. First, the predefined graph structures are shared throughout the network, lacking the flexibility and capacity to model multi-grain semantic information. Second, the relations among global joints are not fully exploited by local graph convolution, which may lose implicit joint relevance. For instance, actions such as running and waving are performed by the co-movement of body parts and joints, e.g., legs and arms, which are located far apart in the physical skeleton connectivity. Inspired by recent attention mechanisms, we propose a multi-grain contextual focus module, termed MCF, to capture action-associated relational information from body joints and parts. As a result, MCF yields more explainable representations for different skeleton action sequences. In this study, we follow the common practice of densely sampling the input skeleton sequences, which introduces substantial redundancy, since the number of sampled frames is fixed regardless of the action. To reduce this redundancy, a temporal discrimination focus module, termed TDF, is developed to capture the locally sensitive points of the temporal dynamics. MCF and TDF are integrated into a standard GCN backbone to form a unified architecture, named STF-Net. Notably, STF-Net captures robust movement patterns from skeleton topology structures, based on multi-grain context aggregation and temporal dependency. Extensive experimental results show that STF-Net achieves state-of-the-art results on three challenging benchmarks: NTU-RGB+D 60, NTU-RGB+D 120, and Kinetics-Skeleton.
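The abstract's core spatial idea (attention relating joints that are far apart in the predefined skeleton graph) can be illustrated with a toy sketch. This is not the paper's actual MCF implementation; it is a minimal self-attention over joint features with random matrices standing in for learned projection weights, assuming a 25-joint NTU-style skeleton and hypothetical feature sizes:

```python
import numpy as np

def joint_attention(x, d_k=8, seed=0):
    """Toy self-attention over skeleton joints (hypothetical sketch).

    x: (V, C) array of V joint feature vectors.
    Returns (V, d_k) refined features and the (V, V) attention map,
    which relates every joint pair regardless of physical adjacency.
    """
    rng = np.random.default_rng(seed)
    V, C = x.shape
    # Random projections stand in for learned query/key/value weights.
    Wq, Wk, Wv = (rng.standard_normal((C, d_k)) for _ in range(3))
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(d_k)               # (V, V) pairwise joint relations
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)      # softmax over all joints
    return attn @ v, attn

# 25 joints (NTU-RGB+D layout) with 16-dimensional features.
x = np.random.default_rng(1).standard_normal((25, 16))
out, attn = joint_attention(x)
```

Unlike a graph convolution restricted to the predefined adjacency, every row of `attn` is a distribution over all 25 joints, so e.g. a hand joint can directly weight a foot joint; MCF additionally aggregates such relations at multiple granularities (joints and body parts).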
ArticleNumber 109231
Author Wu, Liyu
Zhang, Can
Zou, Yuexian
Author_xml – sequence: 1
  givenname: Liyu
  orcidid: 0000-0002-9040-6623
  surname: Wu
  fullname: Wu, Liyu
  email: wuliyu@pku.edu.cn
  organization: ADSPLAB, School of ECE, Peking University, Shenzhen, China
– sequence: 2
  givenname: Can
  surname: Zhang
  fullname: Zhang, Can
  email: cannyzhang@tencent.com
  organization: Tencent Media Lab, Shenzhen, China
– sequence: 3
  givenname: Yuexian
  surname: Zou
  fullname: Zou, Yuexian
  email: zouyx@pku.edu.cn
  organization: ADSPLAB, School of ECE, Peking University, Shenzhen, China
CitedBy_id crossref_primary_10_1007_s11042_024_18864_y
crossref_primary_10_1016_j_imavis_2024_104919
crossref_primary_10_1016_j_patcog_2023_110199
crossref_primary_10_3390_s23125414
crossref_primary_10_1016_j_knosys_2024_112319
crossref_primary_10_3390_s25061769
crossref_primary_10_1007_s13735_023_00301_9
crossref_primary_10_1016_j_cviu_2024_103992
crossref_primary_10_1007_s10489_024_05544_5
crossref_primary_10_1016_j_knosys_2023_111074
crossref_primary_10_1007_s11760_024_03259_1
crossref_primary_10_1016_j_sigpro_2024_109592
crossref_primary_10_1007_s11227_024_06531_w
crossref_primary_10_1587_transinf_2023EDP7223
crossref_primary_10_1109_ACCESS_2024_3452553
crossref_primary_10_1016_j_patcog_2023_109528
crossref_primary_10_1049_cvi2_12296
crossref_primary_10_1109_TGRS_2024_3416112
crossref_primary_10_3390_s24082519
crossref_primary_10_1016_j_eswa_2024_124013
crossref_primary_10_1016_j_patcog_2023_110188
crossref_primary_10_1016_j_patcog_2023_110087
crossref_primary_10_1016_j_patcog_2023_109455
crossref_primary_10_3390_app14188185
crossref_primary_10_1016_j_patcog_2024_110427
crossref_primary_10_1016_j_patcog_2023_110209
crossref_primary_10_1016_j_imavis_2024_104991
crossref_primary_10_3390_s23249738
crossref_primary_10_1007_s40747_025_01811_1
crossref_primary_10_1016_j_aej_2025_01_118
crossref_primary_10_3390_s24061908
crossref_primary_10_1016_j_patcog_2024_111151
crossref_primary_10_3390_s24082567
ContentType Journal Article
Copyright 2022 Elsevier Ltd
Copyright_xml – notice: 2022 Elsevier Ltd
DBID AAYXX
CITATION
DOI 10.1016/j.patcog.2022.109231
DatabaseName CrossRef
DatabaseTitle CrossRef
DatabaseTitleList
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISSN 1873-5142
ExternalDocumentID 10_1016_j_patcog_2022_109231
S0031320322007105
ISSN 0031-3203
IngestDate Tue Jul 01 02:36:40 EDT 2025
Thu Apr 24 23:01:16 EDT 2025
Fri Feb 23 02:39:24 EST 2024
IsPeerReviewed true
IsScholarly true
Keywords Action recognition
Graph convolutional network
Skeleton topology
Language English
LinkModel DirectLink
ORCID 0000-0002-9040-6623
ParticipantIDs crossref_citationtrail_10_1016_j_patcog_2022_109231
crossref_primary_10_1016_j_patcog_2022_109231
elsevier_sciencedirect_doi_10_1016_j_patcog_2022_109231
PublicationCentury 2000
PublicationDate April 2023
2023-04-00
PublicationDateYYYYMMDD 2023-04-01
PublicationDate_xml – month: 04
  year: 2023
  text: April 2023
PublicationDecade 2020
PublicationTitle Pattern recognition
PublicationYear 2023
Publisher Elsevier Ltd
Publisher_xml – name: Elsevier Ltd
SSID ssj0017142
Score 2.5907304
SourceID crossref
elsevier
SourceType Enrichment Source
Index Database
Publisher
StartPage 109231
SubjectTerms Action recognition
Graph convolutional network
Skeleton topology
Title SpatioTemporal focus for skeleton-based action recognition
URI https://dx.doi.org/10.1016/j.patcog.2022.109231
Volume 136
hasFullText 1
inHoldings 1
isFullTextHit
isPrint