UAV Multi-Target Tracking Algorithm Jointly Optimized by YOLOv5 and Deep-SORT

LUO Xi, ZHAO Rui, ZHUANG Huishan, LUO Honggang

Citation: LUO Xi, ZHAO Rui, ZHUANG Huishan, LUO Honggang. UAV Multi-Target Tracking Algorithm Jointly Optimized by YOLOv5 and Deep-SORT[J]. JOURNAL OF SIGNAL PROCESSING, 2022, 38(12): 2628-2638. DOI: 10.16798/j.issn.1003-0530.2022.12.017


Funding:

Natural Science Foundation of Fujian Province (2019J01055)

Details
    About the authors:

    LUO Xi, female, born in 1998 in Yichun, Jiangxi. Master's student at the College of Information Science and Engineering, Huaqiao University. Her main research interests are wireless communication and target tracking. E-mail: xiluo@stu.hqu.edu.cn

    ZHAO Rui, male, born in 1980 in Yangzhou, Jiangsu. Associate professor (Ph.D.) at the College of Information Science and Engineering, Huaqiao University, head of the Department of Communication Engineering, and IEEE member. His recent research interests are communication signal processing and machine learning. E-mail: rzhao@hqu.edu.cn

    ZHUANG Huishan, female, born in 1996 in Putian, Fujian. Master's student at the College of Information Science and Engineering, Huaqiao University. Her main research interest is big data mining. E-mail: 906931238@qq.com

    LUO Honggang, male, born in 1996 in Jinzhong, Shanxi. Master's student at the College of Information Science and Engineering, Huaqiao University. His main research interest is urban subsidence monitoring with time-series InSAR. E-mail: 1274722836@qq.com

UAV Multi-Target Tracking Algorithm Jointly Optimized by YOLOv5 and Deep-SORT

  • Abstract: Aiming at tracking failures caused by poor small-target detection performance, large target scale variation, and complex background interference on unmanned aerial vehicle (UAV) platforms, this paper proposes a UAV multi-target tracking algorithm that jointly optimizes YOLOv5 (You Only Look Once) and Deep-SORT (Simple Online and Realtime Tracking with a Deep Association Metric). The algorithm rebuilds the feature extraction module of the detector with an improved CSPDarknet53 (Cross Stage Partial Darknet53) backbone, designs a small-target detection layer through a top-down and bottom-up bidirectional fusion network, and trains the optimized detection network on a UAV aerial-photography dataset, addressing the poor detection performance on small targets. In the tracking module, a residual network combined with a spatiotemporal attention module is proposed as the appearance feature extraction network, strengthening the network's ability to perceive fine appearance features and resist interference; a triplet loss function is then used to sharpen the network's discrimination of within-class differences. Experimental results show that the mean detection precision of the optimized detector improves by 11% over the original YOLOv5, and that on the UAVDT (The Unmanned Aerial Vehicle Benchmark: Object Detection and Tracking) dataset the tracking accuracy and precision improve by 13.288% and 3.968% respectively over the original tracking algorithm, while effectively reducing the frequency of target identity switches.
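The triplet loss mentioned in the abstract takes its standard form, max(d(a,p) - d(a,n) + margin, 0), pulling embeddings of the same identity together and pushing different identities apart. A minimal pure-Python sketch; the margin value and toy embeddings below are illustrative, not the paper's training configuration:

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=0.3):
    """max(d(a,p) - d(a,n) + margin, 0): the anchor should sit closer
    to the positive (same identity) than to the negative (different
    identity) by at least `margin`."""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)

# Toy 2-D embeddings: positive already well separated from negative.
print(triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 0.0]))  # → 0.0
```

In a re-identification network such as the one described here, the loss is applied to batches of (anchor, positive, negative) image triplets so that the learned embeddings separate visually similar targets of different identities.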
  • Figure 1   Algorithm structure framework diagram

    Figure 2   Structure of the optimized object detection network

    Figure 3   Tracking effect comparison results

    Figure 4   Complex scene tracking results

    Table 1   Performance comparison before and after the YOLOv5 improvement

    Network model | Car AP(%) | Bus AP(%) | Truck AP(%) | mAP(%) | mAP@.5:.95(%)
    YOLOv5        | 64.5      | 42.1      | 14.3        | 21.0   | 10.3
    YOLOv5_1      | 72.9      | 53.1      | 35.1        | 32.0   | 17.1

    Table 2   Tracking results of the proposed algorithm on the UAVDT dataset

    Tracking model     | MOTA(%) | MOTP(%) | FN(frames) | FP(frames) | IDs
    YOLOv5+Deep-SORT   | 23.237  | 71.332  | 102840     | 157750     | 1096
    YOLOv5_1+Deep-SORT | 15.692  | 70.894  | 87247      | 199230     | 934
    YOLOv5+Deep-SORT_1 | 31.217  | 71.541  | 120900     | 112720     | 872
    ours               | 36.525  | 75.300  | 75750      | 140640     | 819
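The MOTA and MOTP columns follow the CLEAR MOT metrics of reference [23]; MOTA combines exactly the three error counts reported in the table. A minimal sketch; the ground-truth total used in the example is a hypothetical illustration, not a figure from the paper:

```python
def mota(fn, fp, id_switches, gt_total):
    """CLEAR MOT accuracy: MOTA = 1 - (FN + FP + IDs) / GT,
    expressed as a percentage. `gt_total` is the number of
    ground-truth object instances over the whole sequence."""
    return 100.0 * (1.0 - (fn + fp + id_switches) / gt_total)

# Hypothetical example: 10 misses, 10 false positives, 0 identity
# switches over 100 ground-truth boxes.
print(mota(10, 10, 0, 100))  # → 80.0
```

This is why reducing FN, FP, and identity switches simultaneously, as the table shows for the proposed method, directly raises MOTA.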

    Table 3   Performance comparison of target tracking algorithms

    Tracking model | MOTA(%) | MOTP(%) | FN(frames) | FP(frames) | IDs
    CMOT[4]        | 27.178  | 75.112  | 146420     | 98915      | 2920
    SORT[9]        | 33.163  | 76.711  | 66490      | 57440      | 3918
    ours           | 36.525  | 75.300  | 75750      | 140640     | 819
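The large drop in identity switches relative to SORT reflects Deep-SORT's deep association metric, which scores each detection against a track's gallery of stored re-ID embeddings by cosine distance. A minimal sketch under the assumption of toy 2-D embeddings (the gallery contents are illustrative):

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity between two appearance embeddings."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    return 1.0 - dot / (norm_u * norm_v)

def appearance_cost(track_gallery, detection_embedding):
    """Score a detection against a track by the smallest cosine
    distance to any embedding stored for that track."""
    return min(cosine_distance(e, detection_embedding) for e in track_gallery)

# A detection matching one of the track's stored appearances costs ~0.
print(appearance_cost([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0]))  # → 0.0
```

In the full tracker this appearance cost is combined with a motion-based Mahalanobis gate before Hungarian assignment; an improved embedding network therefore lowers identity switches without touching the assignment step.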
  • [1]

    GAO Ming,JIN Lisheng,JIANG Yuying,et al. Manifold Siamese network:A novel visual tracking ConvNet for autonomous vehicles[J]. IEEE Transactions on Intelligent Transportation Systems,2020,21(4):1612- 1623. doi:10.1109/tits.2019.2930337 doi: 10.1109/tits.2019.2930337

    [2]

    YFANTIS E A. A UAV with autonomy,pattern recognition for forest fire prevention,and AI for providing advice to firefighters fighting forest fires[C]// 2019 IEEE 9th Annual Computing and Communication Workshop and Conference. Las Vegas,NV,USA. IEEE,2019:409- 413. doi:10.1109/ccwc.2019.8666471 doi: 10.1109/ccwc.2019.8666471

    [3] 杨建秀,谢雪梅,石光明,等. 特征信息增强的无人机车辆实时检测算法[J]. 信号处理,2022,38(5):901- 914.

    YANG Jianxiu,XIE Xuemei,SHI Guangming,et al. Real-time UAV vehicle detection based on enhanced feature information[J]. Journal of Signal Processing,2022,38(5):901- 914.(in Chinese)

    [4]

    BAE S H,YOON K J. Robust online multi-object tracking based on tracklet confidence and online discriminative appearance learning[C]// 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus,OH,USA. IEEE,2014:1218- 1225. doi:10.1109/cvpr.2014.159 doi: 10.1109/cvpr.2014.159

    [5]

    Al-SHAKARJI N M,BUNYAK F,SEETHARAMAN G,et al. Multi-object tracking cascade with multi-step data association and occlusion handling[C]// 2018 15th IEEE International Conference on Advanced Video and Signal Based Surveillance(AVSS). Auckland,New Zealand. IEEE,2018:1- 6. doi:10.1109/avss.2018.8639321 doi: 10.1109/avss.2018.8639321

    [6]

    JIN J,LI X,LI X,et al. Online multi-object tracking with Siamese network and optical flow[C]// 2020 IEEE 5th International Conference on Image,Vision and Computing(ICIVC). Beijing,China. IEEE,2020:193- 198. doi:10.1109/icivc50857.2020.9177480 doi: 10.1109/icivc50857.2020.9177480

    [7]

    ZHANG Yifu,SUN Peize,JIANG Yi,et al. ByteTrack:multi-object tracking by associating every detection box[EB/OL]. 2021:arXiv:2110.06864[cs.CV]. https://doi.org/10.48550/arXiv.2110.06864. doi: 10.48550/arXiv.2110.06864

    [8]

    KIM C,LI Fuxin,CIPTADI A,et al. Multiple hypothesis tracking revisited[C]// 2015 IEEE International Conference on Computer Vision. Santiago,Chile. IEEE,2015:4696- 4704. doi:10.1109/iccv.2015.533 doi: 10.1109/iccv.2015.533

    [9]

    BEWLEY A,GE Zongyuan,OTT L,et al. Simple online and realtime tracking[C]// 2016 IEEE International Conference on Image Processing. Phoenix,AZ,USA. IEEE,2016:3464- 3468. doi:10.1109/icip.2016.7533003 doi: 10.1109/icip.2016.7533003

    [10]

    WANG Z,ZHENG L,LIU Y,et al. Towards real-time multi-object tracking[C]// European Conference on Computer Vision. Glasgow US:Springer,2020:107- 122. doi:10.1007/978-3-030-58621-8_7 doi: 10.1007/978-3-030-58621-8_7

    [11]

    ZENG F,DONG B,WANG T,et al. Motr:End-to-end multiple-object tracking with transformer[EB/OL]. 2021:arXiv:2105.03247[cs.CV]. https://doi.org/10.48550/arXiv.2105.03247. doi: 10.48550/arXiv.2105.03247

    [12]

    CAI J,XU M,LI W,et al. MeMOT:Multi-object tracking with memory[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans,USA. IEEE,2022:8090- 8100. doi:10.1109/cvpr52688.2022.00792 doi: 10.1109/cvpr52688.2022.00792

    [13]

    ZHOU X,YIN T,KOLTUN V,et al. Global tracking transformers[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans,USA. IEEE,2022:8771- 8780. doi:10.1109/cvpr52688.2022.00857 doi: 10.1109/cvpr52688.2022.00857

    [14]

    ZHU P,WEN L,DU D,et al. Detection and tracking meet drones challenge[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence,2021,43(10):1- 1.

    [15]

    DU D,QI Y,YU H,et al. The unmanned aerial vehicle benchmark:Object detection and tracking[C]// Proceedings of the European Conference on Computer Vision(ECCV). Munich,Germany:Springer,2018:370- 386. doi:10.1007/978-3-030-01249-6_23 doi: 10.1007/978-3-030-01249-6_23

    [16]

    REN Shaoqing,HE Kaiming,GIRSHICK R,et al. Faster R-CNN:Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence,2017,39(6):1137- 1149. doi:10.1109/tpami.2016.2577031 doi: 10.1109/tpami.2016.2577031

    [17]

    PANG Jiangmiao,CHEN Kai,SHI Jianping,et al. Libra R-CNN:Towards balanced learning for object detection[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR). Long Beach,CA,USA. IEEE,2019:821- 830. doi:10.1109/cvpr.2019.00091 doi: 10.1109/cvpr.2019.00091

    [18]

    REDMON J,DIVVALA S,GIRSHICK R,et al. You only look once:Unified,real-time object detection[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition(CVPR). Las Vegas,NV,USA. IEEE,2016:779- 788. doi:10.1109/cvpr.2016.91 doi: 10.1109/cvpr.2016.91

    [19]

    LIN T Y,DOLLÁR P,GIRSHICK R,et al. Feature pyramid networks for object detection[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu,HI,USA. IEEE,2017:936- 944. doi:10.1109/cvpr.2017.106 doi: 10.1109/cvpr.2017.106

    [20]

    LIU Shu,QI Lu,QIN Haifang,et al. Path aggregation network for instance segmentation[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City,UT,USA. IEEE,2018:8759- 8768. doi:10.1109/cvpr.2018.00913 doi: 10.1109/cvpr.2018.00913

    [21]

    DOSOVITSKIY A,BEYER L,KOLESNIKOV A,et al. An image is worth 16 x 16 words:Transformers for image recognition at scale[EB/OL]. 2020:arXiv:2010.11929[cs.CV]. https://arxiv.org/abs/2010.11929.

    [22]

    WOJKE N,BEWLEY A,PAULUS D. Simple online and realtime tracking with a deep association metric[C]// 2017 IEEE International Conference on Image Processing. Beijing,China. IEEE,2017:3645- 3649. doi:10.1109/icip.2017.8296962 doi: 10.1109/icip.2017.8296962

    [23]

    BERNARDIN K,STIEFELHAGEN R. Evaluating multiple object tracking performance:The CLEAR MOT metrics[J]. EURASIP Journal on Image and Video Processing,2008,2008:1- 10. doi:10.1155/2008/246309 doi: 10.1155/2008/246309

    [24]

    MUELLER M,SMITH N,GHANEM B. A benchmark and simulator for UAV tracking[C]// Computer Vision– ECCV 2016,2016:445- 461. doi:10.1007/978-3-319-46448-0_27 doi: 10.1007/978-3-319-46448-0_27

Publication history
  • Received: 2022-05-09
  • Published: 2022-12-24
