Enhanced Topology Representation Learning for Skeleton-Based Human Action Recognition
Published in | Procedia Computer Science, Vol. 246, pp. 3093-3102 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Elsevier B.V., 2024 |
Subjects | |
Summary: | We propose an enhanced topology representation learning method for the Skeleton-Based Human Action Recognition problem. In this work, we investigate the application of an adaptive graph convolutional layer within the Spatial-Temporal Graph Convolutional Network (ST-GCN) to learn a flexible topology and enhance representation through regularization loss. We assess the effect of using an adaptive graph, which differs for each input to define the neighbors of a joint, instead of using a fixed heuristic graph. Additionally, by controlling the latent space, our model encodes a more effective latent representation for each action class, which can be easily differentiated by the classifier. Moreover, we evaluate the performance of the proposed method with a three-stream network and explore the potential for improved performance through the use of late fusion ensemble techniques on models trained with different modalities. Our proposal achieved promising results on multiple skeleton-based action recognition benchmarks, with an accuracy of 89.06% on the NTU RGB+D (NTU 60) cross-subject split and 87.89% on the Northwestern-UCLA (NUCLA) dataset, representing approximately 0.5% and 10% improvements over the baseline model on these datasets, respectively. |
---|---|
ISSN: | 1877-0509 |
DOI: | 10.1016/j.procs.2024.09.363 |
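
The late fusion ensemble mentioned in the summary can be illustrated with a minimal sketch. It assumes each stream outputs per-class logits and that fusion is a weighted sum of softmax probabilities, which is one common choice; the record does not state the paper's exact fusion rule or name the three modalities. The stream names (`joint`, `bone`, `motion`), the function names, and the score values are all hypothetical.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def late_fusion(stream_logits, weights=None):
    """Fuse per-stream scores and return the predicted class per sample.

    stream_logits: list of arrays, each of shape (num_samples, num_classes).
    weights: optional per-stream weights; equal weighting is assumed if omitted.
    """
    if weights is None:
        weights = [1.0] * len(stream_logits)
    if len(weights) != len(stream_logits):
        raise ValueError("need exactly one weight per stream")
    # Weighted sum of class probabilities across the independently trained streams.
    fused = sum(
        w * softmax(np.asarray(logits, dtype=float))
        for w, logits in zip(weights, stream_logits)
    )
    return fused.argmax(axis=1)


# Hypothetical logits for 2 samples and 3 classes from three streams
# (the modality names are assumptions, not taken from the record).
joint = np.array([[2.0, 1.0, 0.1], [0.2, 0.3, 0.1]])
bone = np.array([[1.5, 1.6, 0.0], [0.1, 2.0, 0.3]])
motion = np.array([[0.5, 0.4, 0.3], [0.0, 0.1, 3.0]])

predictions = late_fusion([joint, bone, motion])
```

Because fusion happens on the output scores, each stream can be trained separately and the weights tuned afterwards on a validation split.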