GAO Jiupeng, SUN Tianchi, CHEN Kai, et al. Lightweight multichannel speech enhancement based on reparameterized convolution[J]. Journal of Signal Processing, 2025, 41(12): 1967-1979. DOI: 10.12466/xhcl.2025.12.009.

Lightweight Multichannel Speech Enhancement Based on Reparameterized Convolution

  • Multichannel speech enhancement leverages the spatial perception of microphone arrays to extract high-quality target speech from noisy mixtures, thereby serving as a critical preprocessing stage for automatic speech recognition, teleconferencing, and assistive hearing. Although deep neural approaches currently dominate, ranging from hybrids that couple learning with classical spatial filtering to fully neural beamforming, their deployment on edge devices remains difficult. Models must simultaneously satisfy strict real-time causality, tight compute and memory budgets, and high accuracy under low signal-to-noise ratio (SNR) and nonstationary, spatially complex noise. Existing lightweight solutions often fall short of this triad, and methods that stay below a few hundred million multiply-accumulate operations per second (MMAC/s) while remaining competitive at low SNR are rare. To address these limitations, we propose the multi-branch causal network (MBCNet), a deployment-oriented, lightweight multichannel architecture built around convolutional reparameterization. MBCNet jointly encodes auditory features, complex spectral representations, and spatial cues. Its backbone comprises three parts: (i) a parallel feature encoder that aligns and fuses the three streams; (ii) a deep extractor with a symmetric encoder-decoder structure and multilevel frequency downsampling-upsampling blocks that expand the effective frequency receptive field; and (iii) a mask estimation head that predicts multichannel complex filters for enhanced-signal reconstruction. Self-attention components are integrated where beneficial to capture long-range dependencies without violating causality. The first key contribution is the reparameterizable multibranch convolution (RepMBConv). During training, RepMBConv uses five coordinated branches (temporal, spectral, joint time-frequency, refinement, and identity) to enrich feature diversity and learn complementary inductive biases.
At inference, the branches are analytically fused into a single convolutional kernel through linear equivalence, incurring zero extra computational overhead. Branch-importance analysis further reveals a hierarchical learning behavior: shallow stages emphasize local refinement, whereas deeper stages prioritize temporal and spectral abstractions. We exploit this property after convergence to add, prune, and fine-tune branches, reallocating capacity to critical channels and scales and yielding measurable gains without increasing complexity. The second contribution is a frequency downsampling-upsampling module that replaces the conventional pairing of strided convolution and transposed convolution. Downsampling is realized by frequency-index splitting, channel stacking, and convolution; upsampling reverses this process via channel separation, frequency-index recombination, and convolution. This design doubles the frequency receptive field without increasing computational cost, improves broadband noise suppression, and avoids the artifacts associated with deconvolution, all while preserving streaming causality. Ablation studies confirm RepMBConv's superiority over standard and dilated convolutions under matched complexity and show that removing spatial or complex-domain features degrades performance. In comparative experiments, MBCNet achieves superior or comparable denoising performance with fewer parameters and lower computational cost, validating its effectiveness and deployment potential on edge devices.
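The linear equivalence behind RepMBConv's fusion can be illustrated with a minimal single-channel NumPy sketch. The kernel shapes and the exact branch set below are assumptions for illustration (the actual model uses multichannel causal convolutions): summing the outputs of several zero-padded "same" convolution branches equals one convolution with a single kernel obtained by embedding every branch kernel into a common shape and adding them.

```python
import numpy as np

def conv2d_same(x, k):
    # zero-padded "same" 2-D cross-correlation, single channel
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def embed(k, shape):
    # place a small kernel at the center of a larger zero kernel
    out = np.zeros(shape)
    dh = (shape[0] - k.shape[0]) // 2
    dw = (shape[1] - k.shape[1]) // 2
    out[dh:dh + k.shape[0], dw:dw + k.shape[1]] = k
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))

# hypothetical branch kernels: joint time-frequency (3x3),
# temporal (1x3), spectral (3x1), refinement (1x1), identity
branches = [rng.standard_normal((3, 3)),
            rng.standard_normal((1, 3)),
            rng.standard_normal((3, 1)),
            rng.standard_normal((1, 1)),
            np.array([[1.0]])]

# training-time forward: sum of per-branch outputs
y_multi = sum(conv2d_same(x, k) for k in branches)

# inference-time forward: one conv with the analytically fused kernel
k_fused = sum(embed(k, (3, 3)) for k in branches)
y_fused = conv2d_same(x, k_fused)

assert np.allclose(y_multi, y_fused)
```

Because the fusion is exact, the five-branch training-time module and the single fused kernel produce identical outputs, which is why inference incurs no extra overhead relative to a plain convolution.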
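The frequency-index splitting and channel stacking used for downsampling (and its exact inverse for upsampling) can be sketched as follows. The tensor layout is an assumption, and the convolutions that follow each rearrangement in the actual module are omitted; the sketch shows only that the rearrangement is lossless and halves the frequency axis while doubling the channel axis.

```python
import numpy as np

def freq_downsample(x, r=2):
    # x: (channels, frames, freq_bins); take every r-th frequency
    # index as one group and stack the r groups on the channel axis
    C, T, F = x.shape
    assert F % r == 0
    return np.concatenate([x[:, :, i::r] for i in range(r)], axis=0)

def freq_upsample(y, r=2):
    # inverse: split channel groups apart and re-interleave their
    # frequency indices to restore the original resolution
    Cr, T, Fr = y.shape
    C = Cr // r
    out = np.zeros((C, T, Fr * r), dtype=y.dtype)
    for i in range(r):
        out[:, :, i::r] = y[i * C:(i + 1) * C]
    return out

x = np.random.default_rng(1).standard_normal((4, 10, 16))
y = freq_downsample(x)                    # shape (8, 10, 8)
assert np.allclose(freq_upsample(y), x)   # lossless round trip
```

Since the element count is unchanged, a fixed-size kernel applied after the rearrangement spans twice the original frequency range at the same cost, which is the receptive-field doubling the module relies on.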
