HU Weijie, LIU Yingbing, MA Fei, et al. Edge deployment of SAR ship detection network based on lossless model compression and quantization-aware training[J]. Journal of Signal Processing, 2024, 40(9): 1674-1684. DOI: 10.12466/xhcl.2024.09.009.

Edge Deployment of SAR Ship Detection Network Based on Lossless Model Compression and Quantization-Aware Training

Abstract: Deep neural networks have shown clear advantages for ship detection in synthetic aperture radar (SAR) images, but their large parameter counts and computational demands make them difficult to deploy in resource-constrained edge environments. To address this problem, we improve the single-stage object detection network You Only Look Once (YOLO) v5s in two respects, network lightweighting and model deployment optimization, and propose a deployment method for SAR ship detection networks in edge environments. For network lightweighting, we combine channel-level pruning based on the scaling factors of batch normalization layers with fine-grained knowledge distillation based on feature responses to achieve lossless compression of the ship detection network. Compared with the baseline, the lightweight model has 80.3% fewer parameters and 51.3% less computation with essentially no loss in detection accuracy: its average precision on the SAR Ship Detection Dataset (SSDD) is 0.979, versus 0.980 for the baseline. For model deployment optimization, we propose a mixed-precision TensorRT inference engine, guided by quantization-aware training, for an embedded graphics processing unit (GPU); it substantially increases inference speed while reducing the device's operating power consumption. On SAR images of 640 × 640 pixels, the lightweight inference engine runs at 208 frames per second, 3.41 times the baseline speed, while the device draws only 6.2 W during inference, 61.0% less than the baseline. In addition, owing to quantization-aware training, the mixed-precision TensorRT engine achieves inference speed and power consumption similar to those of the 8-bit integer (INT8) TensorRT engine while improving average precision by 44.1%, leaving it only 0.9% below the baseline. The experimental results show that the proposed method balances the real-time, accuracy, and low-power requirements of SAR ship detection in edge environments.
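The abstract names channel-level pruning based on batch normalization scaling factors but gives no implementation details. As a rough illustration of the general idea only, the sketch below ranks all channels by the magnitude of their scaling factor and keeps those above a single network-wide threshold. The function name `select_channels`, the global-threshold rule, the layer names, the gamma values, and the prune ratio are all assumptions made for this example, not the authors' method or settings.

```python
# Sketch: channel selection by batch-norm scaling factor (|gamma|).
# Layer names, gamma values, and the prune ratio are made-up illustration data.

def select_channels(bn_gammas, prune_ratio):
    """Return, per layer, the indices of the channels to keep.

    bn_gammas: dict mapping layer name -> list of BN scaling factors.
    prune_ratio: fraction of all channels (network-wide) to remove.
    """
    # Rank every channel in the network by |gamma| to get one global threshold.
    all_abs = sorted(abs(g) for gammas in bn_gammas.values() for g in gammas)
    n_prune = int(len(all_abs) * prune_ratio)
    threshold = all_abs[n_prune] if n_prune < len(all_abs) else float("inf")

    keep = {}
    for name, gammas in bn_gammas.items():
        idx = [i for i, g in enumerate(gammas) if abs(g) >= threshold]
        if not idx:
            # Never remove a whole layer: keep its strongest channel.
            idx = [max(range(len(gammas)), key=lambda i: abs(gammas[i]))]
        keep[name] = idx
    return keep


# Hypothetical scaling factors for two convolution layers.
toy_gammas = {
    "conv1": [0.90, 0.02, 0.50, 0.01],
    "conv2": [0.03, 0.70, 0.04, 0.60],
}
kept = select_channels(toy_gammas, prune_ratio=0.5)
print(kept)
```

In a real pipeline the kept indices would be used to slice the convolution and batch normalization weights, and the slimmed network would then be fine-tuned; according to the abstract, the authors use knowledge distillation at that stage to recover accuracy.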

     
