Automatic lipreading using convolutional neural networks and orthogonal moments
Published in | Mathematical Modeling and Computing Vol. 12; no. 1; pp. 90 - 100 |
---|---|
Main Authors | , , , , |
Format | Journal Article |
Language | English |
Published | 2025 |
Summary: | Understanding speech solely from the visual interpretation of a speaker's lip movements has recently become one of the most complex computer vision tasks. In this paper, we propose a new approach, Optimized Quaternion Meixner Moments Convolutional Neural Networks (OQMMCNN), for building a lipreading system based only on video images. The approach uses Quaternion Meixner Moments (QMMs) as a filter in the Convolutional Neural Network (CNN) architecture. In addition, we use the Grey Wolf Optimization (GWO) algorithm to optimize the local parameters of the QMM filter, with the aim of achieving high classification accuracy. We show that this method effectively reduces both the high dimensionality of the video images and the training time. The approach is tested on a public dataset and compared with methods from the literature that use complex models and deep architectures. |
---|---|
ISSN: | 2312-9794, 2415-3788 |
DOI: | 10.23939/mmc2025.01.090 |
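
The summary mentions tuning filter parameters with the Grey Wolf optimization algorithm. As a rough illustration of how that kind of search proceeds, here is a minimal sketch of a basic Grey Wolf Optimizer in plain Python. It is not the authors' implementation: the function name `grey_wolf_optimize`, its arguments, and the toy objective in the usage note are all hypothetical, and the objective stands in for whatever classification-based criterion the paper actually uses.

```python
import random


def grey_wolf_optimize(objective, bounds, n_wolves=20, n_iters=100, seed=0):
    """Minimize `objective` over box `bounds` with a basic Grey Wolf Optimizer.

    Illustrative sketch only; every name and default here is an assumption.
    """
    rng = random.Random(seed)
    # Random initial pack inside the search box.
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    leaders = []  # best-so-far (value, position) pairs: alpha, beta, delta

    for t in range(n_iters):
        # Score the pack and keep the three best solutions seen so far.
        scored = [(objective(w), list(w)) for w in wolves]
        leaders = sorted(leaders + scored, key=lambda s: s[0])[:3]

        a = 2.0 * (1.0 - t / n_iters)  # shrinks linearly from 2 towards 0
        new_wolves = []
        for wolf in wolves:
            new_pos = []
            for d, (lo, hi) in enumerate(bounds):
                estimate = 0.0
                for _, leader in leaders:
                    A = 2.0 * a * rng.random() - a   # step scale in [-a, a]
                    C = 2.0 * rng.random()           # leader weight in [0, 2]
                    dist = abs(C * leader[d] - wolf[d])
                    estimate += leader[d] - A * dist
                # Average the leader-guided estimates, then clip to the box.
                new_pos.append(min(max(estimate / len(leaders), lo), hi))
            new_wolves.append(new_pos)
        wolves = new_wolves

    # Score the final pack so its positions are also considered.
    scored = [(objective(w), list(w)) for w in wolves]
    best_val, best_pos = min(leaders + scored, key=lambda s: s[0])
    return best_pos, best_val
```

A call such as `grey_wolf_optimize(lambda x: (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2, [(-5.0, 5.0), (-5.0, 5.0)])` searches a two-parameter box for the minimum of a toy quadratic. In a setting like the one the summary describes, the objective would presumably be a validation error computed from the network, which makes each evaluation far more expensive than in this toy case.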