Siamese Network with Multi-Attention Map for Visual Object Tracking

  • Abstract: In visual object tracking, the target's appearance is usually modeled by a rectangular bounding box containing the target. This representation inevitably introduces background interference, and as the scene changes the tracker's focus becomes blurred and ambiguous, which leads to tracking drift. To address these problems, a Siamese network with multi-attention maps for visual object tracking is proposed. First, a Siamese network that focuses on the feature representation of the foreground target region is established. A gradient attention map loss term is constructed to guide network training and to improve the network's ability to distinguish the target from distracting background. In addition, channel attention and spatial attention are embedded to further strengthen the feature representation of the target and to automatically discover discriminative features. Experiments on multiple public benchmark datasets verify the effectiveness of the proposed algorithm and show that it can track visual objects in real time.
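To make the idea of channel and spatial attention concrete, the following is a minimal, generic sketch and not the authors' implementation: the abstract does not specify how the attention modules are built. The function names (`channel_attention`, `spatial_attention`, `apply_attention`) are hypothetical, and the sketch assumes a parameter-free form in which weights come directly from average pooling followed by a sigmoid, whereas a trained tracker would normally use learned layers.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    """Return one weight in (0, 1) per channel.

    feat is a C x H x W nested list. Each channel is squeezed by global
    average pooling; a learned layer would normally follow (assumption:
    omitted here for brevity).
    """
    weights = []
    for channel in feat:
        values = [v for row in channel for v in row]
        weights.append(sigmoid(sum(values) / len(values)))
    return weights


def spatial_attention(feat):
    """Return an H x W map of weights in (0, 1) from the cross-channel mean."""
    num_channels = len(feat)
    height, width = len(feat[0]), len(feat[0][0])
    return [
        [
            sigmoid(sum(feat[c][i][j] for c in range(num_channels)) / num_channels)
            for j in range(width)
        ]
        for i in range(height)
    ]


def apply_attention(feat):
    """Reweight features by channel attention, then by spatial attention."""
    cw = channel_attention(feat)
    reweighted = [
        [[v * cw[c] for v in row] for row in channel]
        for c, channel in enumerate(feat)
    ]
    sw = spatial_attention(reweighted)
    return [
        [
            [reweighted[c][i][j] * sw[i][j] for j in range(len(sw[0]))]
            for i in range(len(sw))
        ]
        for c in range(len(feat))
    ]


# Toy 2-channel, 2x2 feature map (illustrative values only).
features = [[[1.0, -1.0], [0.5, 2.0]], [[0.0, 0.0], [0.0, 0.0]]]
attended = apply_attention(features)
```

The output keeps the input shape, so in a Siamese tracker such a module could in principle be placed on either branch before the template and search features are compared; where the paper actually places it is not stated in the abstract.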

     
