Speech Enhancement for Noise and Acoustic Reverberation Scenarios

  • Abstract: Speech enhancement aims to extract a clean target speech signal from a speech signal corrupted by noise. In a reverberant environment, however, the received signal is the sum of the target source signal and many delayed, attenuated reflections, which substantially degrades the quality and intelligibility of the target speech. To address speech enhancement under noise and acoustic reverberation, this paper designs an unsupervised multichannel speech enhancement method based on blind signal extraction that requires no prior information about the target speech or the acoustic environment. First, the reverberation produced by late reflections is treated as an additive, uncorrelated noise component, yielding a new enhancement model for noisy, reverberant speech; the target speech signal is modeled implicitly through a time-frequency mask, and the model is solved with the primal-dual splitting algorithm. Second, cepstral thresholding is used to enhance the harmonic structure of the target speech signal, so that the target speech in the noisy reverberant signal is strengthened while other components with less energy than the target speech are attenuated. Finally, because the interference on every channel is attenuated, the target speech signal extracted at each iteration is more exclusive and less mixed; on this basis, an adaptive time-frequency Wiener-like masking inverse filter is designed to perform dereverberation and denoising. Experiments evaluate and analyze dereverberation and denoising performance on real speech signals under noisy and reverberant conditions. The results show that the proposed algorithm dereverberates and denoises effectively and outperforms several popular multichannel speech enhancement algorithms.
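To make two of the building blocks named in the abstract more concrete, the following is a minimal single-frame sketch of cepstral thresholding and of a Wiener-like time-frequency mask. It is an illustration under stated assumptions, not the authors' implementation: the function names `cepstral_threshold` and `wiener_like_mask`, the parameters `n_low`, `threshold`, `floor`, and `eps`, and the specific thresholding rule (keep low-quefrency coefficients plus any coefficient whose magnitude exceeds a fixed threshold) are all hypothetical choices. The multichannel blind extraction and the primal-dual splitting solver are not shown.

```python
import numpy as np


def cepstral_threshold(magnitude, n_low=20, threshold=0.1, floor=1e-10):
    """Emphasize spectral envelope and harmonic structure of one frame.

    `magnitude` is a one-sided magnitude spectrum (as returned by
    np.abs(np.fft.rfft(frame)) for an even frame length).
    Assumed rule (illustrative only): keep the low-quefrency cepstral
    coefficients (envelope) and any coefficient whose absolute value
    reaches `threshold` (e.g. pitch peaks); zero everything else.
    """
    log_mag = np.log(np.maximum(magnitude, floor))
    n_fft = 2 * (len(magnitude) - 1)
    # Real cepstrum: inverse FFT of the log-magnitude spectrum.
    cepstrum = np.fft.irfft(log_mag, n=n_fft)
    quefrency = np.arange(n_fft)
    # Symmetric low-quefrency region, so the result stays real-valued.
    low = (quefrency < n_low) | (quefrency > n_fft - n_low)
    keep = low | (np.abs(cepstrum) >= threshold)
    smoothed_log = np.fft.rfft(np.where(keep, cepstrum, 0.0)).real
    return np.exp(smoothed_log)


def wiener_like_mask(target_power, interference_power, eps=1e-12):
    """Wiener-like gain in [0, 1] per time-frequency bin."""
    return target_power / (target_power + interference_power + eps)


# Usage on a synthetic frame (random data, purely for illustration).
rng = np.random.default_rng(0)
frame = rng.standard_normal(512)
mag = np.abs(np.fft.rfft(frame)) + 1e-3
enhanced_mag = cepstral_threshold(mag, n_low=20, threshold=0.5)
# Treat the residual power as "interference" to build a mask.
residual_power = np.maximum(mag**2 - enhanced_mag**2, 0.0)
mask = wiener_like_mask(enhanced_mag**2, residual_power)
masked_spectrum = mask * np.fft.rfft(frame)
```

In this sketch the mask is derived from a single channel and a single pass; the abstract describes an adaptive mask that is updated across iterations and channels, which would replace the fixed estimates used here.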

     
