Framework for detecting and recognizing sign language using absolute pose estimation difference and deep learning
Published in: Machine learning with applications, Vol. 21, p. 100723
Main Authors: , ,
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.09.2025
Summary: Computer vision has been identified as one of the key solutions for human activity recognition, including sign language recognition. Despite the success demonstrated by various studies, isolating signs from continuous video remains a challenge. The sliding window approach has been commonly used for translating continuous video. However, this method subjects the model to unnecessary predictions, leading to increased computational costs. This study proposes a framework that uses absolute pose estimation differences to isolate signs from continuous videos and translate them using a model trained on isolated signs. Pose estimation features were chosen due to their proven effectiveness in various activity recognition tasks within computer vision. The proposed framework was evaluated on 10 videos of continuous signs. According to the findings, the framework achieved an average accuracy of 84%, while the model itself attained 95% accuracy. Moreover, softmax output analysis shows that the model exhibits higher confidence in correctly classified signs, as indicated by higher average softmax scores for correct predictions. This study demonstrates the potential of the proposed framework over the sliding window approach, which tends to overwhelm the model with excessive classification sequences.
Highlights:
• Propose a framework for developing sign detection and recognition models.
• Propose an absolute pose estimation difference algorithm to detect articulation.
• Evaluate the framework performance using continuous videos.
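The core idea summarized above (segmenting a continuous signing video by thresholding the absolute difference of consecutive pose estimates, so that only isolated candidate segments are passed to the classifier) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, threshold, and minimum-length parameter are hypothetical, and the input is assumed to be per-frame landmark coordinates such as those produced by a pose estimator.

```python
import numpy as np

def segment_signs(pose_frames, threshold=0.02, min_len=5):
    """Split a continuous pose sequence into candidate sign segments.

    pose_frames: array of shape (T, K, 2) -- K landmark (x, y) pairs per frame.
    A frame is treated as 'articulating' when the mean absolute landmark
    displacement from the previous frame exceeds `threshold` (a hypothetical
    value; the paper's actual criterion may differ).
    Returns a list of (start, end) frame-index pairs covering high-motion runs.
    """
    # Mean absolute per-landmark displacement between consecutive frames.
    diffs = np.abs(np.diff(pose_frames, axis=0)).mean(axis=(1, 2))  # shape (T-1,)
    moving = diffs > threshold

    segments, start = [], None
    for i, m in enumerate(moving):
        if m and start is None:
            start = i                      # articulation begins
        elif not m and start is not None:
            if i - start >= min_len:       # discard very short motion bursts
                segments.append((start, i))
            start = None
    if start is not None and len(moving) - start >= min_len:
        segments.append((start, len(moving)))
    return segments
```

Each returned segment would then be fed to the isolated-sign classifier, avoiding the many redundant predictions a fixed-stride sliding window would generate over still or transitional frames.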
ISSN: 2666-8270
DOI: 10.1016/j.mlwa.2025.100723