Multiview meta-metric learning for sign language recognition using triplet loss embeddings

Bibliographic Details
Published in: Pattern Analysis and Applications (PAA), Vol. 26, No. 3, pp. 1125-1141
Main Authors: Mopidevi, Suneetha; Prasad, M. V. D.; Kishore, Polurie Venkata Vijay
Format: Journal Article
Language: English
Published: London: Springer London, 01.08.2023 (Springer Nature B.V.)

Summary: Multiview video processing for recognition is a hard problem when the subject is in continuous motion. The problem becomes even tougher when the subject is a human being and the actions to be recognized from the video data form a complex set called sign language. Although many deep learning models have been applied successfully to sign language recognition (SLR), very few have considered multiple views in their training sets. In this work, we propose to apply meta-metric learning to video-based SLR. In contrast to traditional metric learning, where the triplet loss is constructed on sample-based distances, meta-metric learning operates on set-based distances. Consequently, we construct meta-cells over the entire multiview dataset and perform task-based learning with respect to support cells and query sets. Additionally, we propose a maximum view pooled distance on sub-tasks for binding intra-class views. Experiments conducted on a multiview sign language dataset and four human action recognition datasets show that the proposed multiview meta-metric learning model (MVDMML) achieves higher accuracies than the baselines.
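The two ideas the summary names — a triplet loss built on set-based rather than sample-based distances, and a maximum-pooled distance across views — can be sketched roughly as below. This is a hypothetical illustration only, not the paper's implementation: the function names, the Euclidean sample metric, and the mean-of-pairwise-distances set metric are all assumptions made for the sketch.

```python
import numpy as np

def set_distance(a, b):
    # Set-based distance between two embedding sets a (m, d) and b (n, d):
    # here taken as the mean of all pairwise Euclidean distances (an assumption).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # (m, n)
    return d.mean()

def max_view_pooled_distance(query_views, support_views):
    # Pool per-view set distances with a max, so the loss is driven by the
    # worst-matching view — one reading of "maximum view pooled distance".
    return max(set_distance(q, s) for q, s in zip(query_views, support_views))

def set_triplet_loss(anchor, positive, negative, margin=1.0):
    # Triplet loss on set-based distances instead of sample-based distances:
    # pull the positive set within `margin` closer than the negative set.
    return max(0.0, set_distance(anchor, positive)
                    - set_distance(anchor, negative) + margin)
```

With identical anchor and positive sets and a distant negative set, the loss is zero; swapping positive and negative yields a large positive loss, which is the behavior a set-based triplet objective should exhibit.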
ISSN: 1433-7541, 1433-755X
DOI: 10.1007/s10044-023-01134-2