X-ray Prohibited-Item Detection Based on Adaptive Multi-Scale Feature Fusion

Abstract: X-ray prohibited-item detection is a critical task: it identifies dangerous goods at airports, stations, and other public places, helping to prevent accidents and protect passengers. However, complex image backgrounds and large variations in target scale make it difficult for traditional detection algorithms to detect items accurately. To address these two problems while meeting the accuracy and running-speed requirements of real detection scenarios, this paper improves the YOLOv5s network model. First, to strengthen the network's extraction of global features, a Transformer module is introduced into the backbone network; its global modeling ability compensates for the limits of purely local information. Second, to detect prohibited items of different scales more accurately, a multi-scale receptive-field adaptive fusion module is designed by combining dilated (atrous) convolution with the Convolutional Block Attention Module (CBAM). The module allocates receptive-field information of different scales appropriately, which improves detection accuracy for prohibited items of different scales against complex backgrounds and lets the model adapt better to different task scenarios. Finally, an optimized DIoU bounding-box regression loss (EDIoU) is adopted, which introduces a penalty weight φ into DIoU; this shortens training time and reduces the bounding-box loss error while further improving detection accuracy. To verify the feasibility of the proposed improvements, the optimized YOLOv5s model was evaluated on SIXray_OD, a dataset built in the authors' laboratory. The experimental results show that the average detection precision of the optimized model reaches 89.8%, which is 0.9% higher than that of the original model.
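For readers unfamiliar with the loss the abstract builds on, the following is a minimal sketch of the standard DIoU loss for a single pair of axis-aligned boxes, written in plain Python. The abstract does not give the form of EDIoU, so the `phi` argument here is a hypothetical placeholder that simply scales the center-distance penalty; it is an assumption for illustration and not the authors' formulation. With `phi=1.0` the function reduces to standard DIoU. The function name and the `(x1, y1, x2, y2)` box layout are also assumptions.

```python
def diou_loss(box_a, box_b, phi=1.0):
    """Standard DIoU loss for two boxes given as (x1, y1, x2, y2).

    `phi` is a hypothetical weight on the center-distance penalty
    (an assumption; the source does not define EDIoU's exact form).
    Boxes are assumed to have positive width and height.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union area and IoU.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    iou = inter / (area_a + area_b - inter)

    # Squared distance between the two box centers.
    dx = (ax1 + ax2) / 2.0 - (bx1 + bx2) / 2.0
    dy = (ay1 + ay2) / 2.0 - (by1 + by2) / 2.0
    center_dist_sq = dx * dx + dy * dy

    # Squared diagonal of the smallest box enclosing both boxes.
    enc_w = max(ax2, bx2) - min(ax1, bx1)
    enc_h = max(ay2, by2) - min(ay1, by1)
    diag_sq = enc_w * enc_w + enc_h * enc_h

    # DIoU loss: 1 - IoU + (weighted) normalized center distance.
    return 1.0 - iou + phi * center_dist_sq / diag_sq
```

Unlike a plain IoU loss, the center-distance term still gives a useful gradient signal when the predicted and ground-truth boxes do not overlap, which is why DIoU-style losses tend to converge faster.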

     
