Framework for detecting and recognizing sign language using absolute pose estimation difference and deep learning
Published in: Machine learning with applications, Vol. 21, p. 100723
Main Authors: , ,
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.09.2025
Summary: Computer vision has been identified as one of the key solutions for human activity recognition, including sign language recognition. Despite the success demonstrated by various studies, isolating signs from continuous video remains a challenge. The sliding window approach has been commonly used for translating continuous video. However, this method subjects the model to unnecessary predictions, leading to increased computational costs. This study proposes a framework that uses absolute pose estimation differences to isolate signs from continuous videos and translate them using a model trained on isolated signs. Pose estimation features were chosen due to their proven effectiveness in various activity recognition tasks within computer vision. The proposed framework was evaluated on 10 videos of continuous signs. According to the findings, the framework achieved an average accuracy of 84%, while the model itself attained 95% accuracy. Moreover, softmax output analysis shows that the model exhibits higher confidence in correctly classified signs, as indicated by higher average softmax scores for correct predictions. This study demonstrates the potential of the proposed framework over the sliding window approach, which tends to overwhelm the model with excessive classification sequences.
Highlights:
• Propose a framework for developing sign detection and recognition models.
• Propose an absolute pose estimation difference algorithm to detect articulation.
• Evaluate the framework performance using continuous videos.
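The core idea summarized above (segmenting a continuous signing video by thresholding the absolute difference of consecutive pose estimates, so that only isolated candidate segments are passed to the classifier) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, threshold, and minimum-length parameter are hypothetical, and the input is assumed to be per-frame landmark coordinates such as those produced by a pose estimator.

```python
import numpy as np

def segment_signs(pose_frames, threshold=0.02, min_len=5):
    """Split a continuous pose sequence into candidate sign segments.

    pose_frames: array of shape (T, K, 2) -- K landmark (x, y) pairs per frame.
    A frame is treated as 'articulating' when the mean absolute landmark
    displacement from the previous frame exceeds `threshold` (a hypothetical
    value; the paper's actual criterion may differ).
    Returns a list of (start, end) frame-index pairs covering high-motion runs.
    """
    # Mean absolute per-landmark displacement between consecutive frames.
    diffs = np.abs(np.diff(pose_frames, axis=0)).mean(axis=(1, 2))  # shape (T-1,)
    moving = diffs > threshold

    segments, start = [], None
    for i, m in enumerate(moving):
        if m and start is None:
            start = i                      # articulation begins
        elif not m and start is not None:
            if i - start >= min_len:       # discard very short motion bursts
                segments.append((start, i))
            start = None
    if start is not None and len(moving) - start >= min_len:
        segments.append((start, len(moving)))
    return segments
```

Each returned segment would then be fed to the isolated-sign classifier, avoiding the many redundant predictions a fixed-stride sliding window would generate over still or transitional frames.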
ISSN: 2666-8270
DOI: 10.1016/j.mlwa.2025.100723