A single-channel speech enhancement method based on double-layer dictionary learning

    Abstract: To improve speech enhancement in complex noise environments, a single-channel speech enhancement method based on double-layer dictionary learning is proposed. In the training stage, feature sub-dictionaries are first initialized from clean speech and noise. A double-layer joint dictionary is then trained with an optimization function that combines discriminative constraints and anti-confusion constraints. The first-layer dictionary represents the distinguishable components of the speech signal and the noise, while the second-layer dictionary represents their easily confused components. In the test stage, the noisy speech is projected onto the double-layer joint dictionary to obtain a sparse coefficient matrix, from which the enhanced speech is reconstructed. The constraints in the objective function reduce the occurrence of the "cross-projection" phenomenon and the confusion of signals in the joint dictionary, which further improves the enhancement. Experimental results show that, measured by global signal-to-noise ratio (SNR), Perceptual Evaluation of Speech Quality (PESQ), and log-spectral distance (LSD), the proposed method outperforms speech enhancement methods based on sparsity-constrained non-negative matrix factorization and on improved Wiener filtering, and removes noise more effectively.
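The test-stage idea (project the noisy speech onto a joint dictionary, then reconstruct from the speech part of the coefficients) can be illustrated with a minimal sketch. This is not the proposed double-layer method: it uses a single-layer joint dictionary with no discriminative or anti-confusion constraints, and it substitutes a generic nonnegative sparse coding solver (multiplicative updates) and a Wiener-like mask for whatever solver and reconstruction rule the paper uses. The names `sparse_code`, `enhance`, `D_speech`, and `D_noise` are hypothetical, and the inputs are assumed to be nonnegative magnitude spectrograms and nonnegative dictionaries.

```python
import numpy as np


def sparse_code(Y, D, sparsity=0.1, n_iter=200, eps=1e-12):
    """Nonnegative sparse coding of Y on a fixed dictionary D.

    Approximately minimizes 0.5 * ||Y - D C||_F^2 + sparsity * sum(C)
    over C >= 0 using multiplicative updates. D is not updated here.
    Assumes Y (F x T) and D (F x K) are nonnegative.
    """
    rng = np.random.default_rng(0)
    # Strictly positive initialization so multiplicative updates can move.
    C = rng.random((D.shape[1], Y.shape[1])) + eps
    DtY = D.T @ Y
    DtD = D.T @ D
    for _ in range(n_iter):
        C *= DtY / (DtD @ C + sparsity + eps)
    return C


def enhance(Y, D_speech, D_noise, sparsity=0.1, n_iter=200, eps=1e-12):
    """Estimate the speech magnitude spectrogram from a noisy one.

    Y is projected onto the joint dictionary [D_speech, D_noise]; the
    speech and noise parts of the reconstruction form a Wiener-like
    mask that is applied to Y (an assumption of this sketch).
    """
    D = np.hstack([D_speech, D_noise])
    C = sparse_code(Y, D, sparsity=sparsity, n_iter=n_iter, eps=eps)
    k = D_speech.shape[1]
    speech_part = D_speech @ C[:k]   # contribution of speech atoms
    noise_part = D_noise @ C[k:]     # contribution of noise atoms
    mask = speech_part / (speech_part + noise_part + eps)
    return mask * Y
```

When speech atoms and noise atoms overlap strongly, part of the speech is explained by noise atoms and vice versa. This is the "cross-projection" effect that the constraints described in the abstract are intended to reduce, and this plain sketch does nothing to prevent it.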

     
