Two-Step Noise Amplitude Estimators Using Complex Gaussian Distribution Model

欧世峰, 刘伟, 宋鹏, 赵晓晖

Citation: 欧世峰, 刘伟, 宋鹏, 赵晓晖. Two-Step Noise Amplitude Estimators Using Complex Gaussian Distribution Model[J]. Journal of Signal Processing, 2017, 33(7): 918-926.

Funding: National Natural Science Foundation of China; Natural Science Foundation of Shandong Province
Corresponding author: 欧世峰, E-mail: ousfeng@126.com

  • CLC number: TN912.3

  • Abstract: Estimating the noise amplitude spectrum is an important step in suppressing external noise and improving the overall output quality of speech enhancement algorithms. Compared with noise power estimation, however, amplitude estimation has received relatively little attention. In addition, the commonly used voice activity detection (VAD) approach updates or estimates the noise amplitude only during speech-absent segments, which degrades its performance in non-stationary noise. To overcome this drawback, this paper proposes two novel two-step noise amplitude estimators based on the assumption that the noise spectrum follows a complex Gaussian distribution. In the first step, the noise power spectrum is computed with a soft decision (SD) method. In the second step, the noise amplitude is obtained indirectly from the relationship between amplitude and power that holds under the complex Gaussian distribution. Simulations under various noise conditions indicate that the proposed estimators yield significantly better speech quality than the commonly used VAD method.
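The two-step idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the two proposed estimators, so the function names, the smoothing factor `alpha`, and the recursive form of the soft-decision update are all assumptions made for illustration. The one relation used with confidence is the standard result that, for a zero-mean complex Gaussian bin with power E[|N|^2] = λ, the amplitude |N| is Rayleigh distributed with mean sqrt(πλ)/2.

```python
import math
import random


def soft_decision_power_update(prev_power, noisy_power, speech_presence_prob, alpha=0.9):
    """Step 1 (hypothetical form): soft-decision update of the noise power.

    The previous estimate is kept in proportion to the speech presence
    probability and recursively smoothed toward the noisy periodogram
    otherwise. `alpha` is an assumed smoothing factor, not a value from the paper.
    """
    smoothed = alpha * prev_power + (1.0 - alpha) * noisy_power
    return speech_presence_prob * prev_power + (1.0 - speech_presence_prob) * smoothed


def amplitude_from_power(noise_power):
    """Step 2: mean amplitude implied by the power under a complex Gaussian model.

    |N| is Rayleigh distributed, so E[|N|] = sqrt(pi * noise_power) / 2.
    """
    return math.sqrt(math.pi * noise_power) / 2.0


def simulate_mean_amplitude(noise_power, n_samples=200_000, seed=0):
    """Monte Carlo check: average |N| for complex Gaussian samples of the given power."""
    rng = random.Random(seed)
    sigma = math.sqrt(noise_power / 2.0)  # real and imaginary parts each carry half the power
    total = 0.0
    for _ in range(n_samples):
        total += math.hypot(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
    return total / n_samples


if __name__ == "__main__":
    power = soft_decision_power_update(prev_power=1.0, noisy_power=3.0, speech_presence_prob=0.2)
    print("updated noise power:", power)
    print("implied noise amplitude:", amplitude_from_power(power))
    print("simulated mean amplitude:", simulate_mean_amplitude(power))
```

Running the script compares the closed-form amplitude with a simulated average. The two should agree closely, which is the property the second step relies on.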
Publication history
  • Received: 2016-12-04
  • Revised: 2017-03-15
  • Published: 2017-07-24
