Review of Monaural Speech Enhancement Based on Deep Neural Networks

    Abstract: Speech enhancement aims to separate speech from noise in order to improve speech quality and intelligibility. Over the past several decades, many types of speech enhancement methods have been proposed. However, these methods do not achieve optimal performance in non-stationary noise because they do not make full use of prior information about speech and noise. In recent years, with the advance of deep learning, the deep neural network (DNN) has become a mainstream approach to speech enhancement and plays an important role in improving speech quality and intelligibility. This paper reviews DNN-based monaural speech enhancement methods from the perspective of network structure. First, the background of speech enhancement is introduced. Next, speech enhancement methods based on four different types of neural networks are described in detail. Finally, suggestions for future speech enhancement methods and the conclusions of the paper are given.

     
