Multi-scale Monocular Depth Estimation Network Based on Channel Attention
Abstract: Existing monocular depth estimation algorithms suffer from inaccurate estimation of fine details and from incorrect distance estimates within a single plane. Depth is estimated from the three-channel information of the image pixels, yet few existing algorithms consider how the relationships among feature-map channels affect the estimated depth. This paper therefore proposes the SE-DenseDepth network, which embeds a channel attention mechanism in the encoder and encodes the channels according to how much each contributes to the depth information, improving the encoder's ability to represent image features. To recover fine depth detail, the network adds skip connections from the encoder to the decoder, which bring in more low-level information. The network is trained on the general-purpose indoor dataset NYU-Depth V2 and tested on real data. Experimental results show that the proposed method estimates depth more accurately in detail regions where the depth changes abruptly, and that it does not produce depth discontinuities on large, distant planes. Compared with other mainstream algorithms, the proposed method achieves better depth estimation performance.
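The channel attention described in the abstract can be illustrated with a small sketch. It assumes that the "SE" prefix refers to a squeeze-and-excitation style block, which the abstract does not state explicitly. The function name `se_block`, the bias-free weight matrices `w1` and `w2`, the reduction ratio `R`, and the toy tensor sizes are all hypothetical and are not taken from the paper. Plain Python lists are used so that the example has no dependencies; a real encoder would use a deep learning framework.

```python
import math
import random


def se_block(feature_map, w1, w2):
    """Reweight the channels of a [C][H][W] feature map given as nested lists.

    w1: [C // r][C] weights of the bottleneck layer (hypothetical, bias-free)
    w2: [C][C // r] weights of the expansion layer (hypothetical, bias-free)
    """
    # Squeeze: global average pooling yields one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: bottleneck layer with ReLU, then expansion layer with sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


random.seed(0)
C, H, W, R = 4, 2, 3, 2  # toy sizes; R is an assumed reduction ratio
features = [
    [[random.random() for _ in range(W)] for _ in range(H)] for _ in range(C)
]
w1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(C // R)]
w2 = [[random.uniform(-1, 1) for _ in range(C // R)] for _ in range(C)]
scaled, gates = se_block(features, w1, w2)
```

Each gate lies strictly between 0 and 1, so a channel is attenuated in proportion to its learned importance rather than removed outright. In a trained network the weights would be learned jointly with the encoder instead of drawn at random.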