WANG Jinyu, SUN Fengsong, WANG Yinhao, et al. Joint optimization for multi-UAV sensing based on integrated sensing and communication[J]. Journal of Signal Processing, 2026, 42(2): 131-147. DOI: 10.12466/xhcl.2026.02.002.

Joint Optimization for Multi-UAV Sensing Based on Integrated Sensing and Communication

  • Low-Altitude Wireless Networks (LAWNs) serve as a critical infrastructure pillar for achieving extensive wide-area coverage and intelligent sensing capabilities. Nevertheless, the overall efficacy of these networks is severely constrained by rigid limitations, specifically the scarcity of spectrum resources and the restricted capabilities of onboard hardware. To mitigate these challenges, Integrated Sensing and Communication (ISAC) technology has emerged as a promising solution. By enabling the sharing of spectrum and hardware resources, ISAC not only enhances resource utilization efficiency but also facilitates a deeper integration of sensing and communication functions. Building upon ISAC, multi-Unmanned Aerial Vehicle (UAV) cooperative sensing offers a pathway to transcend the performance bottlenecks inherent to single-UAV systems, particularly regarding coverage scope and sensing precision. Consequently, this approach represents a pivotal strategy for satisfying the stringent requirements of low-altitude supervision and regulation. However, the introduction of multi-UAV collaboration engenders complex spatial geometric constraints and co-channel interference. Furthermore, in three-dimensional dynamic environments, the strong coupling between communication and sensing tasks renders the joint optimization of UAV trajectories, transmit power, subcarrier allocation, and user association a highly intractable Mixed-Integer Non-Linear Programming (MINLP) problem. Additionally, classical Multi-Agent Reinforcement Learning (MARL) algorithms encounter significant bottlenecks when applied to large-scale networks, including state space explosion, sluggish convergence speeds, and low efficiency in policy search. These limitations hinder their ability to meet the stringent real-time demands of dynamic cooperative sensing tasks. 
In response to these challenges, this paper proposes a Quantum-enhanced Hierarchical Multi-Agent Proximal Policy Optimization (Q-H-MAPPO) algorithm designed to maximize the joint effectiveness of multi-UAV cooperative sensing and communication. First, a joint optimization model tailored for cooperative sensing is constructed. The Cramér-Rao Lower Bound (CRLB) is adopted as the key sensing performance metric to quantify the impact of multi-UAV geometric configurations on sensing accuracy. The objective is to minimize the positioning error while guaranteeing the Quality of Service (QoS) of communication tasks. Next, a Hierarchical Markov Decision Process (H-MDP) is designed under a Centralized Training and Decentralized Execution (CTDE) framework. This hierarchical structure decomposes the task and thereby decouples the discrete variables from the continuous ones. Furthermore, the study introduces an amplitude encoding mechanism based on quantum variational circuits and designs a graph attention mechanism inspired by the quantum swap test. By simulating the feature mapping capabilities of quantum states within a classical computing framework, the proposed method efficiently extracts non-linear cooperative relationships among agents as well as critical channel state information. Simulation results indicate that Q-H-MAPPO outperforms the benchmark methods in multi-target, high-load dynamic scenarios. In a scenario with six targets, the algorithm reduces the positioning CRLB to approximately 0.18 m, a reduction of at least 7% compared with the benchmark methods. In a large-scale networking scenario with 15 UAVs, the system sum rate reaches approximately 31.5 Mbps, an improvement of approximately 21% to 44% over the same baselines. Moreover, when the network scales to 20 UAVs, the inference latency remains stable between 20 and 26 ms, a reduction of approximately 81% to 87% compared with typical baselines. The algorithm also converges fastest, stabilizing within 150 to 200 training episodes. These findings validate the advantages of Q-H-MAPPO in improving cooperative sensing accuracy, system throughput, and real-time decision-making in large-scale low-altitude networks.
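
To illustrate how a CRLB can tie positioning accuracy to multi-UAV geometry, the following is a minimal sketch and not the paper's model. It assumes range-only measurements with independent Gaussian noise of standard deviation sigma, for which the Fisher information matrix is J = (1/sigma^2) * sum_i u_i u_i^T, where u_i is the unit vector from UAV i to the target. The function name `position_crlb`, the measurement model, and the example coordinates are all illustrative assumptions; the abstract does not specify them.

```python
import math

def position_crlb(uav_positions, target, sigma):
    """Lower bound on 3D position RMSE (metres): sqrt(trace(J^-1)).

    Assumes range-only measurements with i.i.d. Gaussian noise of
    standard deviation `sigma` (an illustrative model, not the paper's).
    """
    # Accumulate the 3x3 Fisher information matrix J.
    J = [[0.0] * 3 for _ in range(3)]
    for p in uav_positions:
        d = [t - c for t, c in zip(target, p)]
        dist = math.sqrt(sum(x * x for x in d))
        if dist == 0.0:
            raise ValueError("UAV position coincides with the target")
        u = [x / dist for x in d]  # unit line-of-sight vector
        for r in range(3):
            for c in range(3):
                J[r][c] += u[r] * u[c] / sigma ** 2

    det = (J[0][0] * (J[1][1] * J[2][2] - J[1][2] * J[2][1])
           - J[0][1] * (J[1][0] * J[2][2] - J[1][2] * J[2][0])
           + J[0][2] * (J[1][0] * J[2][1] - J[1][1] * J[2][0]))
    scale = (J[0][0] + J[1][1] + J[2][2]) ** 3
    if det <= 1e-12 * scale:
        raise ValueError("degenerate geometry: target is not observable in 3D")

    # trace(J^-1) = trace(adj(J)) / det(J): sum of the principal 2x2 minors.
    minors = ((J[1][1] * J[2][2] - J[1][2] * J[2][1])
              + (J[0][0] * J[2][2] - J[0][2] * J[2][0])
              + (J[0][0] * J[1][1] - J[0][1] * J[1][0]))
    return math.sqrt(minors / det)

# Hypothetical geometries around a target at the origin (metres).
target = (0.0, 0.0, 0.0)
spread = [(100.0, 0.0, 0.0), (0.0, 100.0, 0.0), (0.0, 0.0, 100.0)]
clustered = [(100.0, 0.0, 0.0), (100.0, 5.0, 0.0), (100.0, 0.0, 5.0)]

print("spread   :", position_crlb(spread, target, sigma=1.0))
print("clustered:", position_crlb(clustered, target, sigma=1.0))
```

With three mutually orthogonal lines of sight the bound equals sigma * sqrt(3), while the clustered formation yields a much larger bound because its lines of sight are nearly parallel. This geometric dependence is the kind of effect a trajectory optimizer can exploit when the CRLB is used as the sensing objective.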