Review of UAV Detection Technology Based on Multimodal Information Fusion
Abstract
With the rapid development of the low-altitude economy, the number of low, slow, and small (LSS) civilian multi-rotor unmanned aerial vehicles (UAVs) has increased dramatically, and the security threats posed by “black flying” (illegal flights) have become increasingly prominent. Single-modal detection technologies are constrained by their physical limits and can no longer meet the requirements of complex tasks, making multimodal information fusion the mainstream direction for detection in the industry. This paper systematically reviews the development status, technological evolution, and future challenges of multimodal detection technologies for LSS civilian multi-rotor UAVs, the core targets in civilian low-altitude “black flying”. First, it analyzes the performance boundaries and complementary characteristics of single modalities, such as radar, electro-optical, radio frequency (RF), and acoustic sensing, with respect to long-range detection, high-precision recognition, passive detection, and cost control. Second, it traces the evolution of multimodal fusion technologies from the decision level (weighted fusion) and the feature level (heterogeneous feature extraction and interaction) to the hybrid level (multi-stage coupling), comparing the advantages, disadvantages, and applicable scenarios of each level. Finally, it details mainstream public datasets such as Anti-Drone and MMAUD, analyzing core elements including sensor configurations, task types, and data synchronization. The review finds that feature-level fusion is currently the mainstream paradigm for improving detection accuracy, but it still faces bottlenecks in computational resource consumption and heterogeneous data alignment. Hybrid-level fusion has a more complex architecture, yet it is the key to balancing accuracy and efficiency.
Analysis of typical cases, such as “RF + Optical” field tests, verifies the feasibility of joint feature enhancement and cross-modal track fusion schemes in complex environments, with a detection probability exceeding 95%. The review concludes that multimodal fusion effectively addresses the weaknesses of single-modal detection, such as susceptibility to environmental interference and high missed-detection rates, and significantly enhances system robustness. Future research should deepen hybrid-level fusion architectures, promote the application of large-model technologies to multimodal data association and feature matching, and focus on spatial heterogeneous fusion strategies, thereby providing a reference for building an intelligent low-altitude regulatory system with all-domain perception.
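As a concrete illustration of the decision-level (weighted fusion) idea mentioned above, the following minimal sketch combines per-sensor detection confidences into one score using a normalized weighted average. It is not taken from any system covered by the review: the function name `weighted_decision_fusion`, the sensor names, the weights, and the 0.5 threshold are all hypothetical and chosen only for illustration.

```python
from typing import Dict


def weighted_decision_fusion(scores: Dict[str, float],
                             weights: Dict[str, float]) -> float:
    """Fuse per-sensor detection confidences in [0, 1] with a normalized weighted average."""
    # Only fuse sensors that reported a score and have a configured weight.
    active = [name for name in scores if name in weights]
    total_weight = sum(weights[name] for name in active)
    if total_weight <= 0:
        raise ValueError("at least one reporting sensor needs a positive weight")
    return sum(weights[name] * scores[name] for name in active) / total_weight


# Hypothetical confidences and weights, for illustration only.
scores = {"radar": 0.9, "rf": 0.8, "optical": 0.4}
weights = {"radar": 2.0, "rf": 1.0, "optical": 1.0}

fused = weighted_decision_fusion(scores, weights)
is_uav = fused >= 0.5  # hypothetical decision threshold
```

Normalizing by the weights of only the sensors that reported lets the sketch degrade gracefully when one modality drops out, which reflects the robustness argument made for multimodal fusion. Feature-level and hybrid-level schemes operate on intermediate representations and are not captured by this simple average.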