ZHU Qingxi, FAN Xiaodong, LIU Jiangang, et al. Hand-gesture recognition for multi-channel millimeter-wave radar based on multi-resolution and multi-presentation fusion[J]. Journal of Signal Processing, 2024, 40(12): 2151-2164. DOI: 10.12466/xhcl.2024.12.005.

Hand-Gesture Recognition for Multi-channel Millimeter-Wave Radar Based on Multi-Resolution and Multi-Presentation Fusion

With the accelerated popularization of smart devices and the rapid development of related technology, gesture recognition has great application potential and broad market prospects in fields such as smart homes and smart driving. In these fields, the key challenge for gesture recognition is maintaining efficient and accurate recognition across different users, different orientations, and easily confused gesture features. To address the low recognition rate of confusable gestures and the underutilization of feature information, this study proposed a gesture recognition method based on a convolutional neural network with multi-domain representation and multi-resolution fusion, aiming to complement and optimize gesture features. First, three feature images in different domains, i.e., time-frequency, time-distance, and time-angle images, were formed by the Short-Time Fourier Transform (STFT), the Two-Dimensional Fast Fourier Transform (2D-FFT), and Minimum Variance Distortionless Response (MVDR) beamforming, respectively. For these three feature images, a composite neural network comprising three parallel Two-Dimensional Convolutional Neural Networks (2DCNNs) connected in series with a Multi-Resolution Fusion Module (MRFM) was designed to extract gesture features from the images and recognize the gestures.
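The three domain representations named above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the radar cube shape, array geometry, STFT window, angle grid, and diagonal-loading factor are all assumptions for demonstration.

```python
import numpy as np
from scipy.signal import stft

rng = np.random.default_rng(0)

# Assumed FMCW radar data cube: (chirps, fast-time samples, Rx channels).
n_chirps, n_samples, n_rx = 128, 64, 4
cube = (rng.standard_normal((n_chirps, n_samples, n_rx))
        + 1j * rng.standard_normal((n_chirps, n_samples, n_rx)))

# 1) Time-frequency (micro-Doppler) image: STFT over slow time,
#    here for one range bin and one channel.
slow_time = cube[:, 0, 0]
_, _, Zxx = stft(slow_time, nperseg=32, noverlap=24, return_onesided=False)
tf_map = np.abs(Zxx)

# 2) Time-distance image: 2D-FFT over fast time (range) and slow time
#    (Doppler) for one channel, giving a range-Doppler map.
rd_map = np.abs(np.fft.fft2(cube[:, :, 0]))

# 3) Time-angle image: MVDR spatial spectrum across the Rx array at one
#    range bin, using chirps as snapshots.
snapshots = cube[:, 0, :].T                       # (n_rx, n_chirps)
R = snapshots @ snapshots.conj().T / n_chirps     # spatial covariance
R += 1e-3 * np.trace(R).real / n_rx * np.eye(n_rx)  # diagonal loading
R_inv = np.linalg.inv(R)

angles = np.deg2rad(np.arange(-60, 61, 2))        # assumed scan grid
d = 0.5                                           # spacing in wavelengths
spectrum = np.empty(angles.size)
for i, th in enumerate(angles):
    # Steering vector for a uniform linear array.
    a = np.exp(-2j * np.pi * d * np.arange(n_rx) * np.sin(th))
    # MVDR output power: P(theta) = 1 / (a^H R^{-1} a).
    spectrum[i] = 1.0 / np.real(a.conj() @ R_inv @ a)

print(tf_map.shape, rd_map.shape, spectrum.shape)
```

Stacking such maps over successive frames yields the time-frequency, time-distance, and time-angle images that feed the three parallel 2DCNN branches.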
Finally, a multi-domain feature image dataset containing seven types of gestures was created to train and test the model. The test results showed that, in recognition scenarios with different users, different locations, and different environments, the proposed method improved the average recognition accuracy of the seven confusable gestures by at least 2.3% compared with the model without the MRFM, and by at least 4.6% compared with a model using only a single-domain representation.
