A Two-Stage Deep Learning Technique for Joint Acoustic Echo and Reverberation Suppression

A Two-stage Deep Learning Based Method for Acoustic Echo Cancellation and Speech Dereverberation

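  • Abstract (translated from the Chinese): In modern communication systems, echo and reverberation severely degrade the quality and intelligibility of transmitted speech and thereby seriously disrupt person-to-person communication. To remove the negative effects of both at once, this paper proposes a two-stage deep learning system for joint acoustic echo and reverberation suppression. The system removes the additive acoustic echo and the reverberation caused by multipath propagation step by step to recover the target speech. It first uses a model based on the ideal ratio mask (IRM) to remove the acoustic echo, which is uncorrelated with the target signal. Then, for the reverberation, which is strongly correlated with the target signal, it uses a spectral mapping model based on a "hidden mask". Finally, the two stages are jointly trained to obtain better system performance. Experimental results across a range of acoustic environments show that the proposed system markedly suppresses echo and reverberation and thereby greatly improves the quality and intelligibility of the target speech.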

     

    Abstract: In modern telecommunications, echo and reverberation can significantly disturb communication and degrade speech intelligibility and quality. To overcome the negative impact of echo and reverberation simultaneously, we propose a two-stage, jointly trained system based on deep learning that enhances the speech signal by performing echo cancellation and speech dereverberation sequentially. The system first employs a model based on the ideal ratio mask (IRM) to cancel the acoustic echo, which is uncorrelated with the target signal. The reverberation, which is strongly correlated with the target signal, is then removed using a spectral mapping model combined with a hidden mask. Finally, the two stages are jointly trained to obtain better performance. A series of systematic experiments conducted under different acoustic conditions indicates that the proposed system significantly improves echo cancellation and dereverberation performance and achieves better speech intelligibility and quality than other methods.
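
To make the first stage more concrete, the following is a minimal sketch of how an ideal ratio mask could be computed as a training target and applied to a mixture magnitude spectrogram. The abstract does not give the exact IRM definition used, so the square-root power-ratio form below is an assumption (it is one common definition), and the function names `ideal_ratio_mask` and `apply_mask` and the toy values are hypothetical. The second-stage spectral mapping model and the joint training are not sketched here.

```python
import math

def ideal_ratio_mask(target_mag, echo_mag, eps=1e-12):
    """Per-bin IRM, assumed here to be sqrt(S^2 / (S^2 + E^2)).

    target_mag and echo_mag are magnitude spectrograms given as
    nested lists indexed [frame][frequency_bin].
    eps guards against division by zero in silent bins.
    """
    return [
        [math.sqrt(s * s / (s * s + e * e + eps))
         for s, e in zip(s_frame, e_frame)]
        for s_frame, e_frame in zip(target_mag, echo_mag)
    ]

def apply_mask(mask, mixture_mag):
    """Scale each time-frequency bin of the mixture by its mask value."""
    return [
        [m * x for m, x in zip(m_frame, x_frame)]
        for m_frame, x_frame in zip(mask, mixture_mag)
    ]

# Toy magnitudes (2 frames x 2 bins), chosen only for illustration.
target = [[3.0, 0.0], [1.0, 2.0]]   # near-end (target) speech
echo = [[4.0, 1.0], [0.0, 2.0]]     # acoustic echo
mixture = [[5.0, 1.0], [1.0, 2.8]]  # microphone signal magnitudes

mask = ideal_ratio_mask(target, echo)
estimate = apply_mask(mask, mixture)
```

In a learned system, a network would be trained to predict such a mask from the microphone and far-end signals, since the clean target and echo are not available at inference time. Mask values lie in [0, 1], so bins dominated by echo are attenuated while bins dominated by the target are mostly preserved.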

     
