Integrated Registration Method with Enhanced Position Awareness for Optical and SAR Images
Abstract: Image registration is the basis of optical and SAR image information fusion. Most existing registration methods rely on feature-point detection and matching, which generalizes poorly across scene types: such methods are prone to large numbers of mismatches or too few valid correspondences, causing registration to fail. To address this problem, this paper proposes an integrated registration method with enhanced position awareness for optical and SAR images. The method uses a deep network to directly regress the geometric transformation between an image pair, achieving end-to-end high-precision registration without relying on feature-point detection and matching. Specifically, a feature-extraction module incorporating coordinate attention is first used in the backbone network to capture position-sensitive fine-grained features from the input image pair. Second, the multiscale features output by the backbone are fused, combining the localization information of shallow features with the semantic information of deep features. Finally, a loss function that jointly considers position deviation and image similarity is proposed to optimize the registration result. Experiments on OS-Dataset, a public high-resolution optical and SAR registration dataset, show that compared with four typical existing algorithms (OS-SIFT, RIFT2, DHN, and DLKFM), the proposed method is robust across diverse scene types, including urban, farmland, river, repetitive-texture, and weak-texture regions, and outperforms them in both visual quality and quantitative accuracy. The percentage of image pairs with an average corner error below 3 pixels is more than 25% higher than that of DLKFM, the most accurate of the four baselines, and the registration speed is roughly on par with DHN, the fastest of the four. The proposed method therefore achieves high-precision, high-efficiency registration of optical and SAR images.
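The coordinate attention used in the feature-extraction module is a published mechanism (Hou et al., CVPR 2021) that pools along the two spatial axes separately, so the resulting attention weights remain position-sensitive. The following is a minimal PyTorch sketch of such a block, not necessarily the paper's exact module; the reduction ratio and activation choices are assumptions.

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Minimal coordinate-attention block (after Hou et al., 2021)."""

    def __init__(self, channels: int, reduction: int = 32):
        super().__init__()
        mid = max(8, channels // reduction)  # reduction ratio is an assumption
        # Pool along each spatial axis separately so the attention map
        # retains positional information along the other axis.
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # -> (B, C, H, 1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # -> (B, C, 1, W)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        xh = self.pool_h(x)                  # (B, C, H, 1)
        xw = self.pool_w(x).transpose(2, 3)  # (B, C, W, 1)
        y = torch.cat([xh, xw], dim=2)       # joint encoding, (B, C, H+W, 1)
        y = self.act(self.bn(self.conv1(y)))
        yh, yw = torch.split(y, [h, w], dim=2)
        ah = torch.sigmoid(self.conv_h(yh))                  # per-row weights
        aw = torch.sigmoid(self.conv_w(yw.transpose(2, 3)))  # per-column weights
        return x * ah * aw  # position-sensitive reweighting of the features
```

Applied inside a backbone stage, the block reweights each feature map with one factor per row and one per column, which is what lets the network keep the fine-grained positional cues that plain channel attention discards.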
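The abstract does not state how the multiscale backbone features are fused; a common top-down (FPN-style) scheme that merges shallow, high-resolution maps with deep, semantically rich ones would look like the sketch below, where the channel counts are illustrative assumptions.

```python
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleFusion(nn.Module):
    """FPN-style top-down fusion of backbone features (illustrative)."""

    def __init__(self, in_channels=(64, 128, 256), out_channels=128):
        super().__init__()
        # 1x1 lateral convolutions project each scale to a common width.
        self.laterals = nn.ModuleList(
            [nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels]
        )
        self.smooth = nn.Conv2d(out_channels, out_channels, 3, padding=1)

    def forward(self, feats):
        # feats: list of maps ordered shallow (high-res) to deep (low-res).
        laterals = [lat(f) for lat, f in zip(self.laterals, feats)]
        # Upsample deeper maps and sum them top-down into the shallow ones,
        # so localization detail and semantics end up in one map.
        fused = laterals[-1]
        for lat in reversed(laterals[:-1]):
            fused = lat + F.interpolate(
                fused, size=lat.shape[-2:], mode="bilinear", align_corners=False
            )
        return self.smooth(fused)
```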
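The joint loss is described only as combining position deviation with image similarity. A plausible rendering, assuming an L1 penalty on the displacements of the four reference corners for the position term, one minus normalized cross-correlation (NCC) between the SAR patch warped by the predicted transformation and the optical patch for the similarity term, and a hypothetical weight alpha:

```python
import torch
import torch.nn.functional as F

def joint_loss(pred_corners, gt_corners, warped_sar, optical, alpha=0.1):
    """Position-deviation + image-similarity loss (hedged sketch).

    pred_corners / gt_corners: predicted and ground-truth corner offsets.
    warped_sar: SAR patch warped by the predicted transformation.
    alpha: assumed balancing weight between the two terms.
    """
    # Position term: mean L1 distance between predicted and true corner offsets.
    pos = F.l1_loss(pred_corners, gt_corners)
    # Similarity term: 1 - NCC between the warped SAR and optical patches.
    o = optical - optical.mean(dim=(-2, -1), keepdim=True)
    s = warped_sar - warped_sar.mean(dim=(-2, -1), keepdim=True)
    ncc = (o * s).sum(dim=(-2, -1)) / (
        o.pow(2).sum(dim=(-2, -1)).sqrt()
        * s.pow(2).sum(dim=(-2, -1)).sqrt()
        + 1e-8
    )
    return pos + alpha * (1.0 - ncc.mean())
```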
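The reported accuracy metric, the percentage of image pairs whose average corner error (ACE) is below 3 pixels, can be computed as in the helper below; the (N, 4, 2) corner layout is an assumption.

```python
import numpy as np

def ace_success_rate(pred_corners, gt_corners, threshold=3.0):
    """Fraction of pairs with average corner error below `threshold` pixels.

    pred_corners, gt_corners: (N, 4, 2) arrays holding the four corner
    positions mapped by the predicted / ground-truth transformations.
    """
    # Per-pair ACE: mean Euclidean distance over the four corners.
    err = np.linalg.norm(pred_corners - gt_corners, axis=-1).mean(axis=-1)
    return float((err < threshold).mean())
```

With threshold=3.0 this yields the success rate used in the abstract's comparison against OS-SIFT, RIFT2, DHN, and DLKFM.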