Staring Radar Target Recognition Algorithm Based on Multimodal Feature Fusion

黄仕林; 李苏武; 张月

doi:10.12466/xhcl.2026.05.008

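Abstract: With its distinctive transmit-receive beam design and long-duration integration detection, holographic staring radar can effectively detect and reliably recognize "low, slow, and small" (LSS) targets. Because staring radar supplies two modalities for target recognition, Doppler information and trajectory information, extracting and fusing these two kinds of features efficiently is key to improving recognition performance. This paper proposes a multimodal feature fusion method based on graph neural networks. In the Doppler feature extraction network, a module that combines large-kernel perception with dynamic small-kernel focusing (Large-Small Residual Neural Network, LS-ResNet) extracts Doppler features at four scales. For trajectory feature extraction, a Gated Recurrent Unit (GRU) network extracts temporal features from the trajectory, and an attention mechanism highlights the motion features of key frames. In the fusion stage, the Doppler and trajectory features produced by these modules are used to build a graph structure, and a graph neural network models the complex relationships among the feature nodes, exploiting the complementary features of the two modalities to fuse them efficiently. Using Doppler and trajectory data of LSS targets measured by holographic staring radar in real scenes, several backbone networks were designed for comparative experiments. With Doppler data alone, LS-ResNet achieved the best accuracy, 96.16%. With trajectory data alone, the trajectory feature extraction network (Attention-based Gated Recurrent Unit, GRU-A) reached 93.97%, outperforming other conventional sequence classification networks. In the multimodal fusion experiments, the proposed graph neural network fusion strategy raised accuracy to 98.12%, which is 1.96 percentage points above the Doppler-only result and 4.15 percentage points above the trajectory-only result. Node ablation experiments further confirm the importance of each node feature. The classification confusion matrices and feature visualizations further show that, compared with a single modality, the fused features significantly reduce misclassification among easily confused classes.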

Abstract: Holographic staring radar enables effective detection and reliable recognition of low-slow-small (LSS) targets owing to its unique transmit-receive beam design and long-term coherent integration. Because staring radar provides both Doppler information and target trajectory information for recognition tasks, efficiently extracting and fusing these two modalities is essential to improving recognition performance. In this study, we propose a multimodal feature fusion method based on graph neural networks (GNNs). For Doppler feature extraction, we employ the Large-Small Residual Neural Network (LS-ResNet), which combines large-kernel perception with dynamic small-kernel focusing, to obtain Doppler representations at four scales. For trajectory feature extraction, a network based on the Gated Recurrent Unit (GRU) models the temporal features of the trajectory sequence, and an attention mechanism highlights key frames that contain critical motion information. In the fusion stage, the Doppler and trajectory features produced by these modules are organized into a graph structure, and a GNN models the complex relationships among feature nodes to exploit the complementary information between the two modalities and achieve effective multimodal fusion. To evaluate the proposed approach, we designed multiple backbone networks for comparative experiments based on real-world Doppler and trajectory data of LSS targets collected by holographic staring radar. With Doppler data alone, LS-ResNet achieved the best accuracy, 96.16%. With trajectory data alone, the Attention-based Gated Recurrent Unit (GRU-A) network attained an accuracy of 93.97%, outperforming other traditional sequence classification models. In the multimodal fusion experiments, the proposed GNN-based fusion strategy raised accuracy to 98.12%, a gain of 1.96 percentage points over the Doppler-only modality and 4.15 percentage points over the trajectory-only modality. Node ablation experiments further validate the importance of each feature node. In addition, the confusion matrix and feature visualization results demonstrate that the fused features significantly reduce misclassification among easily confused target classes compared with single-modality approaches.
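
The data flow described in the abstract (attention-weighted pooling of the trajectory sequence, followed by graph-based fusion with four Doppler feature nodes) can be illustrated with a minimal sketch. This is not the authors' implementation. The learned components (LS-ResNet, GRU, and GNN layers) are replaced by fixed toy vectors, the graph is assumed to be fully connected, a simple mean aggregation stands in for a GNN layer, and every function name and number below is hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Score each time step by its dot product with a query vector,
    then return the softmax-weighted sum of the hidden states."""
    scores = [sum(h * q for h, q in zip(state, query)) for state in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * state[d] for w, state in zip(weights, hidden_states))
              for d in range(dim)]
    return pooled, weights


def message_pass(nodes, neighbours):
    """One round of message passing: each node becomes the average of
    itself and the mean of its neighbours (no learned weights)."""
    updated = []
    for i, node in enumerate(nodes):
        others = [nodes[j] for j in neighbours[i]]
        mean = [sum(vec[d] for vec in others) / len(others)
                for d in range(len(node))]
        updated.append([0.5 * x + 0.5 * m for x, m in zip(node, mean)])
    return updated


def fuse(doppler_nodes, trajectory_node):
    """Build a fully connected graph (an assumption) over all feature
    nodes, run one message-passing round, and concatenate the result."""
    nodes = doppler_nodes + [trajectory_node]
    count = len(nodes)
    neighbours = {i: [j for j in range(count) if j != i] for i in range(count)}
    nodes = message_pass(nodes, neighbours)
    return [x for node in nodes for x in node]


# Toy stand-ins for GRU hidden states over three frames (hypothetical values).
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
trajectory_node, weights = attention_pool(hidden_states, query)

# Toy stand-ins for Doppler features at four scales (hypothetical values).
doppler_nodes = [[0.2, 0.1], [0.4, 0.3], [0.6, 0.5], [0.8, 0.7]]
fused = fuse(doppler_nodes, trajectory_node)
print(len(fused), weights)
```

In this sketch the frame with the largest score receives the largest attention weight, and the fused vector concatenates the five updated nodes. A trained model would replace the fixed query and the mean aggregation with learned parameters and would feed the fused vector to a classifier.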

Multimodal Feature Fusion for Staring Radar Target Recognition