Two-Stage Self-Supervised Algorithm for Generalized Few-Shot Object Detection
Abstract: Deep learning has driven major advances in object detection, but its strong performance depends on large, accurately annotated datasets. In fields where samples are scarce, such as national defense, maritime security, and medicine, annotated data is especially hard to obtain. Few-shot object detection, which aims to build algorithmic frameworks that extract knowledge from extremely limited samples while still detecting objects effectively, has therefore attracted wide attention in the academic community. However, because novel-class samples are scarce, their distribution differs significantly from that of the base classes, which limits detection accuracy. In addition, when the model is fine-tuned on novel classes, the novel and base classes do not overlap, so learning novel-class features triggers large gradient updates. As a result, the model forgets the feature knowledge of the base classes and its overall performance drops. To address the scarcity of novel-class samples, this study adopts a self-supervised learning strategy.
Because self-supervised learning does not rely on annotations and makes it easy to construct proxy tasks for model training, it is an effective way to alleviate sample scarcity in few-shot object detection. To prevent catastrophic forgetting of base-class knowledge after the model learns novel-class features, this paper integrates self-supervised learning with a two-stage object detector. In the category domain, latent features represent the characteristics of each class, and a dynamic update strategy further refines these features while novel classes are learned. In the bounding-box domain, well-designed proxy tasks improve the precision of box regression. Extensive experiments on the PASCAL VOC and MS COCO datasets show that the proposed method outperforms several other few-shot object detection models in both novel-class performance and overall performance.
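The abstract does not spell out the update rule or the proxy task, so the following is only an illustrative sketch of the two ideas under stated assumptions, not the paper's method. It assumes the per-class latent features are refined with an exponential moving average (a common choice, not confirmed by the source) and that the box-domain proxy task is to regress randomly jittered boxes back to their originals. The names `update_prototypes`, `jitter_box`, `momentum`, and `scale` are hypothetical.

```python
import random


def update_prototypes(prototypes, features, labels, momentum=0.9):
    """Return new per-class latent vectors after one EMA step (assumed rule).

    prototypes: dict mapping class id -> list of floats (one latent vector per class).
    features:   list of feature vectors from the current batch.
    labels:     list of class ids, one per feature vector.
    momentum:   hypothetical smoothing factor; higher keeps more of the old vector.
    """
    updated = {cls: list(vec) for cls, vec in prototypes.items()}
    for cls in set(labels):
        rows = [f for f, y in zip(features, labels) if y == cls]
        # Mean feature of this class in the current batch.
        mean = [sum(col) / len(rows) for col in zip(*rows)]
        old = prototypes[cls]  # raises KeyError for a class with no prototype yet
        updated[cls] = [momentum * o + (1.0 - momentum) * m
                        for o, m in zip(old, mean)]
    return updated


def jitter_box(box, scale=0.1, rng=random):
    """Build one label-free proxy sample in the box domain (assumed task).

    box: [x1, y1, x2, y2]. Each coordinate is shifted by up to `scale`
    times the box width (x coordinates) or height (y coordinates).
    Returns (jittered_box, target), where target is the offset that maps
    the jittered box back to the original.
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    sizes = (w, h, w, h)
    jittered = [c + rng.uniform(-scale, scale) * s for c, s in zip(box, sizes)]
    target = [c - j for c, j in zip(box, jittered)]
    return jittered, target
```

In this sketch, classes absent from the current batch keep their latent vectors unchanged, which is one simple way to limit drift in base-class representations while novel classes are learned. The jittered boxes need no extra annotation, which is what makes them usable as a self-supervised regression target.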