Multiview meta-metric learning for sign language recognition using triplet loss embeddings
Published in | Pattern Analysis and Applications (PAA), Vol. 26, No. 3, pp. 1125–1141
---|---
Main Authors | , ,
Format | Journal Article
Language | English
Published | London: Springer London; Springer Nature B.V., 01.08.2023
Subjects |
Summary | Multiview video processing for recognition is a hard problem when the subject is in continuous motion. The problem becomes even harder when the subject is a human being and the actions to be recognized from the video data are a complex set of actions called sign language. Although many deep learning models have been successfully applied to sign language recognition (SLR), very few models have considered multiple views in their training sets. In this work, we propose to apply meta-metric learning to video-based SLR. In contrast to traditional metric learning, where the triplet loss is constructed on sample-based distances, the meta-metric learns on set-based distances. Consequently, we construct meta-cells on the entire multiview dataset and adopt a task-based learning approach with respect to support cells and query sets. Additionally, we propose a maximum view-pooled distance on sub-tasks for binding intra-class views. Experiments conducted on the multiview sign language dataset and four human action recognition datasets show that the proposed multiview meta-metric learning model (MVDMML) achieves higher accuracies than the baselines.
ISSN | 1433-7541; 1433-755X
DOI | 10.1007/s10044-023-01134-2
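
The summary above describes a triplet loss built on set-based (cell-to-cell) distances together with a maximum view-pooled distance for binding intra-class views. The following is a minimal sketch of one plausible reading of that objective, not the authors' implementation: the function names (`view_pooled_distance`, `set_triplet_loss`), the embedding dimension, and the use of PyTorch are assumptions made for illustration.

```python
# Illustrative sketch (not the paper's code): a triplet loss on set-based
# distances between multiview embedding "cells", with a max-pooled cross-view
# distance. Each cell is assumed to be a tensor of shape (num_views, embed_dim)
# produced by some backbone network.
import torch
import torch.nn.functional as F


def view_pooled_distance(query: torch.Tensor, support: torch.Tensor) -> torch.Tensor:
    """Max-pool pairwise Euclidean distances between two sets of view embeddings.

    query:   (V_q, D) view embeddings of one sample/cell
    support: (V_s, D) view embeddings of another cell
    Returns a scalar set-to-set distance: the largest cross-view distance.
    """
    pairwise = torch.cdist(query, support)  # (V_q, V_s) Euclidean distances
    return pairwise.max()


def set_triplet_loss(anchor: torch.Tensor,
                     positive: torch.Tensor,
                     negative: torch.Tensor,
                     margin: float = 1.0) -> torch.Tensor:
    """Triplet loss computed on set-based distances instead of sample-based ones."""
    d_pos = view_pooled_distance(anchor, positive)
    d_neg = view_pooled_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin)


if __name__ == "__main__":
    # Toy usage with random embeddings standing in for multiview video features.
    torch.manual_seed(0)
    anchor = torch.randn(3, 128)    # 3 views of the anchor sign
    positive = torch.randn(3, 128)  # same class, different views
    negative = torch.randn(3, 128)  # different class
    print(set_triplet_loss(anchor, positive, negative).item())
```

Under this reading, max-pooling the cross-view distances penalizes the worst-aligned pair of views, which is one way to push all views of the same class toward a common region of the embedding space; whether the paper pools exactly this way is not stated in the record above.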