Self-supervised Contrastive Learning for Improving the Adversarial Robustness of Deep Neural Networks

Abstract: Deep neural networks have achieved remarkable success in multi-source image analysis and object recognition, and have been gradually deployed in applications such as intelligent surveillance, computer-aided medical image diagnosis, and autonomous driving. However, their vulnerability to adversarial examples poses serious security risks in safety-sensitive domains. One of the most effective defenses is adversarial training, which retrains the network on adversarial examples that maximize its loss; yet existing adversarial training requires class labels to generate those examples and substantially degrades generalization on clean (unattacked) data. This paper proposes a self-supervised contrastive learning method that improves the adversarial robustness of deep neural networks by exploiting abundant unlabeled data to stabilize predictions and preserve generalization under attack. Using a Siamese architecture in which two networks interact and learn from each other, the method maximizes the similarity between the multi-layer hidden representations of a randomly augmented training sample and its unsupervised, instance-level adversarial counterpart, strengthening the model's intrinsic robustness. The proposed method can be applied on its own to harden pre-trained models, or combined with adversarial training to maximize robustness in the "pre-training + fine-tuning" paradigm. Experiments on two remote sensing scene classification benchmark data sets demonstrate its effectiveness and flexibility.
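The key step described above, generating an adversarial view of an image without any class label by attacking representation similarity instead of a classification loss, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the one-layer `encode` function, the finite-difference gradient, and all sizes and step values are assumptions chosen for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(W, x):
    # Toy one-layer encoder standing in for the deep network backbone.
    return np.tanh(W @ x)

def cos_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def unsupervised_fgsm(W, x, x_aug, eps=0.1, h=1e-5):
    """Instance-level attack: perturb x to MINIMIZE the similarity between
    its representation and that of its augmented view. No label is needed,
    unlike standard adversarial training."""
    z_aug = encode(W, x_aug)
    grad = np.zeros_like(x)
    for i in range(x.size):  # finite-difference gradient (sketch only;
        xp = x.copy(); xp[i] += h   # a real model would use backprop)
        xm = x.copy(); xm[i] -= h
        grad[i] = (cos_sim(encode(W, xp), z_aug)
                   - cos_sim(encode(W, xm), z_aug)) / (2 * h)
    # One signed step AGAINST the similarity gradient gives the adversarial view.
    return x - eps * np.sign(grad)

d, k = 8, 4
W = rng.normal(size=(k, d))           # encoder weights
x = rng.normal(size=d)                # a training sample
x_aug = x + 0.01 * rng.normal(size=d) # stand-in for a random augmentation

x_adv = unsupervised_fgsm(W, x, x_aug)
clean_sim = cos_sim(encode(W, x), encode(W, x_aug))
adv_sim = cos_sim(encode(W, x_adv), encode(W, x_aug))
print(clean_sim, adv_sim)  # the attack should lower the similarity
```

A Siamese training step would then update the encoder to *maximize* `cos_sim(encode(W, x_adv), encode(W, x_aug))`, i.e. pull the adversarial view's representation back toward the augmented view's, which is the contrastive robustness objective the abstract describes.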

     
