A mmWave MIMO Radar-Based Gesture Recognition Using Fusion of Range, Velocity, and Angular Information


Bibliographic Details
Published in: IEEE Sensors Journal, Vol. 24, no. 6, pp. 9124-9134
Main Authors: Yu, Jih-Tsun; Tseng, Yen-Hsiang; Tseng, Po-Hsuan
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 15.03.2024

Summary: Radar sensing technology offers an innovative approach to human-computer interaction, distinguished by robust sensing that is impervious to acoustic and optical disturbances, and thus a reliable alternative for user engagement. Gesture recognition with millimeter wave (mmWave) frequency-modulated continuous wave (FMCW) radar extracts range and velocity from the raw data, e.g., as a range-Doppler image (RDI). In addition, the angle estimated with a multiple-input-multiple-output (MIMO) radar also carries rich gesture information. To leverage the MIMO radar technique, we therefore use azimuth/elevation-based range-angle images (RAIs), averaged over slow time, together with the RDI as the spectrum-map input for gesture recognition. Since gesture motion is characterized by the trajectory of position and velocity, we extract features from the two spectrum maps with convolutional neural networks (CNNs), learn each time sequence with a cascaded long short-term memory (LSTM) network, and fuse the two networks at the end to recognize hand gestures. We validate the proposed scheme on hand gestures collected from several subjects in different rooms using a 77-GHz mmWave radar from Texas Instruments (TI). Across various antenna combinations, we observed that the higher angular resolution provided by azimuth and elevation angles enhances the machine-learning model's discriminative ability. By exploiting angle alongside velocity information, the late-fusion network achieves per-frame and per-sequence classification accuracies of 94.67% and 97.43%, respectively, over 12 gestures.
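The summary notes that FMCW radar processing extracts range and velocity from the raw data as a range-Doppler image (RDI). As a minimal illustrative sketch (not the authors' pipeline), the standard way to form an RDI is a two-dimensional FFT over one frame of beat-signal samples: a range FFT along fast time, then a Doppler FFT along slow time. The frame dimensions and target frequencies below are assumed values chosen for the simulation:

```python
import numpy as np

# Hypothetical frame of FMCW beat-signal samples:
# rows = chirps (slow time), cols = ADC samples per chirp (fast time).
n_chirps, n_samples = 64, 128
rng = np.random.default_rng(0)

# Simulate one point target: a constant beat frequency across fast time
# (encodes range) plus a phase ramp across chirps (encodes Doppler,
# i.e., radial velocity). Frequencies sit on exact FFT bins.
fast = np.arange(n_samples)            # fast-time sample index
slow = np.arange(n_chirps)[:, None]    # slow-time chirp index
beat = np.exp(2j * np.pi * (0.25 * fast + 0.125 * slow))
frame = beat + 0.1 * rng.standard_normal((n_chirps, n_samples))

# Range FFT along fast time, then Doppler FFT along slow time;
# fftshift centers zero Doppler (static clutter) in the middle row.
range_fft = np.fft.fft(frame, axis=1)
rdi = np.fft.fftshift(np.fft.fft(range_fft, axis=0), axes=0)
rdi_db = 20 * np.log10(np.abs(rdi) + 1e-12)

# The target appears as a peak at (Doppler bin, range bin):
# range bin = 0.25 * 128 = 32; Doppler bin = 32 + 0.125 * 64 = 40.
peak = np.unravel_index(np.argmax(rdi_db), rdi_db.shape)
print(int(peak[0]), int(peak[1]))  # prints: 40 32
```

In a setup like the paper's, a sequence of such RDIs per gesture (together with the range-angle images obtained from MIMO angle estimation) would form the spectrum-map inputs to the CNN feature extractors.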
ISSN: 1530-437X; 1558-1748
DOI: 10.1109/JSEN.2024.3355395