Multidomain Feature Fusion-Based Low-Altitude Target Recognition Method Using FMCW Radar
Abstract
When frequency-modulated continuous-wave (FMCW) radar is used to monitor low-altitude targets, hovering unmanned aerial vehicles (UAVs) and birds typically produce echoes with highly similar micro-Doppler signatures. This ambiguity often leads to false alarms and missed detections, particularly in cluttered scenes and under low signal-to-noise ratio (SNR) conditions, where time-frequency textures are weakened. Existing approaches that rely primarily on single-modality spectrogram textures may overfit to specific acquisition settings and degrade when the texture becomes noisy or partially missing. We therefore investigate micro-Doppler recognition using a subset of the multiband LSS-FMCWR-1.0 dataset together with measured FMCW radar echo data, and propose a two-path multidomain feature fusion network, HPMNet, which integrates time-frequency representations and physical priors in an end-to-end manner. The processing pipeline begins with conventional FMCW demodulation and clutter suppression. The target range cell is then localized to extract the slow-time signal corresponding to the dominant target response, and a micro-Doppler time-frequency representation is generated as the primary learning input. In addition, we compute a set of statistical descriptors from the complex-valued time-frequency matrix and assemble them into a structured feature vector. These statistics are designed to summarize the distribution, concentration, and variability of the time-frequency energy, as well as other stable characteristics closely related to the target micro-motion dynamics. They thus provide physics-inspired priors that complement the spectrogram textures and offer an interpretable pathway that can remain informative even when the visual texture is degraded. In the network design, we adopt a local-global parallel architecture (LGTF) for the time-frequency branch.
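The preprocessing chain described above (range processing, clutter suppression, range-cell localization, time-frequency transform, and statistical descriptors) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the complex (I/Q) beat matrix layout, slow-time mean removal as the clutter filter, a Hann-windowed STFT as the time-frequency representation, the five example descriptors, and all function names and parameter values.

```python
import numpy as np

def simulate_beat(n_chirps=512, n_samples=128, target_bin=20, clutter_bin=50, seed=0):
    """Toy complex (I/Q) beat matrix: one micro-moving target plus static clutter."""
    rng = np.random.default_rng(seed)
    n = np.arange(n_samples)
    m = np.arange(n_chirps)
    # Sinusoidal phase modulation across slow time stands in for micro-motion.
    micro = np.exp(1j * 3.0 * np.sin(2 * np.pi * 4 * m / n_chirps))
    target = micro[:, None] * np.exp(2j * np.pi * target_bin * n / n_samples)
    clutter = 5.0 * np.exp(2j * np.pi * clutter_bin * n / n_samples)[None, :]
    noise = 0.05 * (rng.standard_normal((n_chirps, n_samples))
                    + 1j * rng.standard_normal((n_chirps, n_samples)))
    return target + clutter + noise

def range_profiles(beat):
    """Windowed FFT along fast time; beat has shape (n_chirps, n_samples)."""
    return np.fft.fft(beat * np.hanning(beat.shape[1]), axis=1)

def suppress_static_clutter(rp):
    """Assumed clutter filter: remove the slow-time mean of every range cell."""
    return rp - rp.mean(axis=0, keepdims=True)

def dominant_range_cell(rp):
    """Index of the range cell with the largest slow-time energy."""
    return int(np.argmax(np.sum(np.abs(rp) ** 2, axis=0)))

def stft(x, win_len=64, hop=16):
    """Complex STFT of a slow-time signal; returns (n_freq, n_frames)."""
    w = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack([x[i * hop:i * hop + win_len] * w for i in range(n_frames)], axis=1)
    return np.fft.fftshift(np.fft.fft(frames, axis=0), axes=0)

def tf_descriptors(tf, eps=1e-12):
    """Illustrative statistics of the time-frequency energy (hypothetical set)."""
    power = np.abs(tf) ** 2                                   # (n_freq, n_frames)
    freqs = np.fft.fftshift(np.fft.fftfreq(tf.shape[0]))[:, None]
    col = power / (power.sum(axis=0, keepdims=True) + eps)    # per-frame distribution
    centroid = (freqs * col).sum(axis=0)                      # Doppler centroid per frame
    bandwidth = np.sqrt((((freqs - centroid) ** 2) * col).sum(axis=0))
    p = power / (power.sum() + eps)                           # global distribution
    entropy = -(p * np.log(p + eps)).sum()                    # energy concentration
    frame_energy = power.sum(axis=0)
    return np.array([centroid.mean(), centroid.std(), bandwidth.mean(), entropy,
                     frame_energy.std() / (frame_energy.mean() + eps)])

def micro_doppler_pipeline(beat):
    """Range FFT -> clutter removal -> range-cell pick -> STFT -> descriptors."""
    rp = suppress_static_clutter(range_profiles(beat))
    cell = dominant_range_cell(rp)
    tf = stft(rp[:, cell])
    return cell, tf, tf_descriptors(tf)

cell, tf, desc = micro_doppler_pipeline(simulate_beat())
```

In this sketch the magnitude of `tf` would feed the time-frequency branch, while `desc` plays the role of the structured feature vector passed to the physics-prior branch.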
The local stream employs multiscale convolutions, realized through multiple parallel routes, to capture fine-grained textures, transient micro-motion structures, and subtle spectral modulations, and introduces squeeze-and-excitation channel attention to strengthen informative channels while suppressing background components. The global stream follows a bottleneck strategy. First, window-based self-attention is performed in a reduced-resolution feature space, which captures periodicity and long-range dependencies at a lower computational cost while enlarging the effective receptive field. Next, the features are upsampled to restore the spatial resolution for subsequent fusion. This local-global design enables the branch to jointly model short-term details and longer-term temporal regularities in micro-Doppler patterns. The physics-prior branch is implemented using a deep and cross network that explicitly models high-order feature interactions among the statistical descriptors and produces a compact physical embedding. The two branches are integrated through a gated additive fusion module that adaptively reweights their contributions. By responding to SNR variations and time-frequency texture degradation, the gating mechanism can dynamically shift the reliance between spectrogram textures and physics-inspired statistical cues. This adaptive weighting improves robustness and enhances generalization across scenarios and frequency bands, because the model can emphasize the modality that is more reliable under the current conditions. Experimental results show that the proposed method achieves an accuracy of 98.8%, a 2.4% improvement over the single-branch baseline. Compared with classical classification networks such as ResNet50, the proposed method not only improves classification accuracy but also reduces the computational cost and the number of parameters.
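The cross-layer interaction and the gated additive fusion mentioned above can be sketched in a few lines of NumPy. This is a hedged illustration, not the HPMNet implementation: the cross layer uses the standard deep-and-cross form x_{l+1} = x_0 (x_l · w) + b + x_l, while the gate form g = sigmoid([f_tf, f_phys] W + b) with output g * f_tf + (1 - g) * f_phys is one common way to realize gated additive fusion and is assumed here. All names, shapes, and the random weights are hypothetical.

```python
import numpy as np

def sigmoid(z):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))

def cross_layer(x0, xl, w, b):
    """One cross layer, batched over rows: x_{l+1} = x0 * (xl . w) + b + xl."""
    return x0 * (xl @ w)[:, None] + b + xl

def gated_additive_fusion(f_tf, f_phys, W, b):
    """Assumed gate: g = sigmoid([f_tf, f_phys] W + b); fused = g*f_tf + (1-g)*f_phys."""
    g = sigmoid(np.concatenate([f_tf, f_phys], axis=1) @ W + b)
    return g * f_tf + (1.0 - g) * f_phys, g

# Toy usage with random (untrained) weights; dimensions are arbitrary.
rng = np.random.default_rng(0)
batch, d_stat, d_embed = 4, 5, 8
stats = rng.normal(size=(batch, d_stat))            # statistical descriptors
x = stats
for _ in range(2):                                   # two stacked cross layers
    x = cross_layer(stats, x, rng.normal(size=d_stat) * 0.1, np.zeros(d_stat))
f_phys = x @ rng.normal(size=(d_stat, d_embed))      # project to the embedding size
f_tf = rng.normal(size=(batch, d_embed))             # stand-in for the TF-branch output
fused, gate = gated_additive_fusion(
    f_tf, f_phys, rng.normal(size=(2 * d_embed, d_embed)) * 0.1, np.zeros(d_embed))
```

Because the gate is computed per sample and per channel, a trained model of this form could lean on `f_phys` when the spectrogram features are unreliable and on `f_tf` otherwise, which matches the adaptive reweighting described in the text.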
Ablation studies and comparative evaluations further confirm that the physical-statistical branch and gated fusion alleviate the overfitting caused by exclusive dependence on time-frequency textures, maintain favorable robustness under low-SNR conditions, and enhance interpretability by relating model outputs to measurable signal statistics. The proposed method therefore provides effective technical support for the fine-grained recognition of low-altitude targets and for related regulatory applications.