A Multimodal Fusion Ship Recognition Method Based on Distance-Controlled Contrastive Learning
Abstract: The objective of this study was to improve ship target recognition performance through a multimodal fusion approach, leveraging the strengths of different sensors. Traditional unimodal recognition methods typically rely on a single sensor's features and achieve high accuracy under specific conditions. However, these methods are limited in their ability to capture multidimensional information about the target, and their feature performance varies across recognition scenarios. While multimodal fusion approaches can address these shortcomings, most existing methods rely on feature concatenation or weighted fusion, which fails to fully exploit the complementarity and correlation between features from different modalities. Additionally, they do not adequately account for the impact of target distance on feature extraction, which limits their fusion effectiveness. To address these issues, this study proposed a multimodal fusion ship recognition method based on distance-controlled contrastive learning to enhance the performance of multimodal fusion in ship target recognition. The proposed method comprises three novel modules that fully utilize the feature information from both sensors, thereby enhancing the fusion process.
First, a distance control module was designed to guide the network in the usage of feature vectors under varying distance conditions, thus optimizing the feature fusion process for different distances. Second, a decoupled contrastive loss module was introduced to preserve the unique information of each modality, thus improving the complementarity between features. Finally, a supervised contrastive loss module was implemented to capture common information from the fused features of targets belonging to the same category, thus establishing the correlation between modality-specific features. The method was validated through experiments using real-world data, with ship targets tested at different distances. The experimental results demonstrated that the proposed distance-controlled radar-infrared feature fusion network outperformed unimodal recognition methods, improving ship recognition accuracy by more than 9% across various distances. Furthermore, it achieved a 4.65% improvement in recognition accuracy compared with existing fusion-based recognition methods. The proposed approach not only enhanced ship target recognition accuracy, but also adapted flexibly to feature variations at different distances, thus demonstrating its superiority in multimodal fusion ship target classification. In conclusion, the proposed method effectively addresses key challenges in multimodal fusion by incorporating distance control, ensuring optimal feature fusion across varying conditions. The use of decoupled contrastive loss helps preserve modality-specific features, while benefiting from the complementary information provided by the other modality. Additionally, the supervised contrastive loss module strengthens the correlation between similar features across modalities, thus improving the classification accuracy. 
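The abstract does not give the exact form of the distance control module, but the idea of steering how feature vectors are used as target range varies can be illustrated with a simple gated blend. In this sketch, everything is assumed for illustration only: the function name, the sigmoid gate, and the 5 km crossover scale are not from the paper, which may use a learned network instead.

```python
import numpy as np

def distance_gated_fusion(f_radar, f_ir, distance_m, d_scale=5000.0):
    """Illustrative distance-controlled fusion gate (not the paper's actual module).

    A sigmoid gate in (0, 1) shifts weight toward the radar feature as range
    grows, on the assumption that infrared detail degrades with distance while
    radar returns are more range-robust. d_scale is a hypothetical crossover
    range (here 5 km) at which both modalities are weighted equally.
    """
    g = 1.0 / (1.0 + np.exp(-(distance_m / d_scale - 1.0)))  # gate weight for radar
    return g * np.asarray(f_radar, dtype=float) + (1.0 - g) * np.asarray(f_ir, dtype=float)
```

In a learned version of such a module, the fixed sigmoid would typically be replaced by a small network conditioned on the measured range, so the blending behavior is fit to the data rather than hand-tuned.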
This combination of modules results in a robust and adaptable fusion network, offering significant improvements over existing methods and making it highly effective for multimodal fusion ship target classification.
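The supervised contrastive loss mentioned above, which pulls together fused features of same-class targets, is commonly computed as in the following NumPy sketch. This is a generic formulation of supervised contrastive learning over L2-normalized embeddings, not the paper's exact implementation; the function name and temperature value are assumptions.

```python
import numpy as np

def supervised_contrastive_loss(z, labels, tau=0.1):
    """Generic supervised contrastive loss over a batch of embeddings z (n, d).

    For each anchor, positives are all other samples with the same label; the
    loss is the mean negative log-probability of the positives under a
    temperature-scaled softmax over all other samples in the batch.
    """
    z = np.asarray(z, dtype=float)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity via dot product
    n = z.shape[0]
    sim = (z @ z.T) / tau
    self_mask = np.eye(n, dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)            # an anchor never contrasts with itself
    # numerically stable log-softmax over each row
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - (row_max + np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True)))
    labels = np.asarray(labels)
    pos = (labels[:, None] == labels[None, :]) & ~self_mask
    n_pos = pos.sum(axis=1)
    valid = n_pos > 0                                  # skip anchors with no positive in batch
    per_anchor = -np.where(pos, log_prob, 0.0).sum(axis=1)
    return float((per_anchor[valid] / n_pos[valid]).mean())
```

When same-class embeddings cluster tightly and classes are well separated, the loss approaches zero; mismatched labels drive it up, which is what pushes the fused features of a given ship class toward a common representation.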