An efficient framework for few-shot skeleton-based temporal action segmentation

Bibliographic Details
Published in: Computer Vision and Image Understanding, Vol. 232, p. 103707
Main Authors: Xu, Leiyang; Wang, Qiang; Lin, Xiaotian; Yuan, Lin
Format: Journal Article
Language: English
Published: Elsevier Inc., 01.07.2023
Summary: Temporal action segmentation (TAS) aims to classify and locate actions in long untrimmed action sequences. With the success of deep learning, many deep models for action segmentation have emerged; however, few-shot TAS remains a challenging problem. This study proposes an efficient framework for few-shot skeleton-based TAS, comprising a data augmentation method and an improved model. The data augmentation approach, based on motion interpolation, addresses the problem of insufficient data and can significantly increase the number of samples by synthesizing action sequences. In addition, we concatenate a Connectionist Temporal Classification (CTC) layer with a network designed for skeleton-based TAS to obtain an optimized model. Leveraging CTC enhances the temporal alignment between prediction and ground truth and further improves the segment-wise metrics of the segmentation results. Extensive experiments on both public and self-constructed datasets, including two small-scale datasets and one large-scale dataset, show the effectiveness of the two proposed methods in improving the performance of the few-shot skeleton-based TAS task.
Highlights:
•An efficient framework for few-shot skeleton-based temporal action segmentation.
•Synthesize new labeled action sequences by motion interpolation.
•A data augmentation method to increase the amount of training data for deep models.
•Introduce CTC loss to strengthen the alignment of action segments.
ISSN: 1077-3142, 1090-235X
DOI: 10.1016/j.cviu.2023.103707
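The abstract gives no implementation details, so the Python sketch below is only an illustration of the two ideas it names, not the authors' code. The interpolation routine (linearly blending joint coordinates to resample a skeleton sequence to a new length) is one plausible reading of "motion interpolation", and the CTC part simply shows how a CTC loss can be computed on frame-wise class scores with PyTorch; the function name interpolate_skeleton_sequence, the tensor shapes, and the toy labels are all assumptions.

import numpy as np
import torch
import torch.nn as nn

def interpolate_skeleton_sequence(seq, new_len):
    # Resample a skeleton sequence of shape (T, J, 3) to new_len frames by
    # linearly blending joint coordinates of neighbouring frames.
    # This is a hypothetical form of motion interpolation, not the paper's method.
    T = seq.shape[0]
    pos = np.linspace(0, T - 1, new_len)        # fractional frame positions
    lo = np.floor(pos).astype(int)
    hi = np.minimum(lo + 1, T - 1)
    w = (pos - lo)[:, None, None]               # blend weight per new frame
    return (1.0 - w) * seq[lo] + w * seq[hi]

# Example: synthesize a slower variant of a 100-frame, 25-joint clip (toy data).
clip = np.random.randn(100, 25, 3).astype(np.float32)
augmented = interpolate_skeleton_sequence(clip, new_len=150)

# CTC loss on top of frame-wise class scores (sketch).
# Assume `logits` are per-frame scores from a TAS backbone, shape (T, N, C),
# with class index 0 reserved for the CTC blank symbol.
T_frames, batch, num_classes = 150, 1, 11
logits = torch.randn(T_frames, batch, num_classes)
log_probs = logits.log_softmax(dim=-1)

# The CTC target is the ordered list of segment labels, without frame alignment.
targets = torch.tensor([[3, 7, 2]])             # hypothetical segment label sequence
input_lengths = torch.full((batch,), T_frames, dtype=torch.long)
target_lengths = torch.tensor([3])

ctc = nn.CTCLoss(blank=0, zero_infinity=True)
loss = ctc(log_probs, targets, input_lengths, target_lengths)

In this reading, the CTC term supervises the order of predicted segments rather than individual frames, which is consistent with the abstract's claim that CTC strengthens temporal alignment and improves segment-wise metrics; how it is combined with the frame-wise segmentation objective in the paper is not specified here.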