A Speech Dereverberation Method Combining BLSTM Recurrent Neural Networks and Non-negative Matrix Factorization

  • Abstract: This paper presents a two-stage, single-channel speech dereverberation method that combines a bidirectional long short-term memory (BLSTM) recurrent neural network with non-negative matrix factorization (NMF). The log power spectrum of the speech signal is used as the feature for suppressing reverberation. In the first stage, the BLSTM-RNN, which can capture information from the entire feature sequence, estimates the log power spectrum of the clean speech. In the second stage, NMF is applied to the reconstructed log power spectrum as a post-processing step to alleviate the over-smoothing problem. Experimental results show that the proposed method effectively suppresses reverberation in speech signals and outperforms the baseline methods on the reported performance metrics.
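
    The second stage can be illustrated with a minimal sketch, given here only under stated assumptions. The abstract does not specify the NMF cost function, the rank, or exactly how the factorization is applied to the BLSTM output, so this example assumes a Euclidean cost with Lee-Seung multiplicative updates, applied to the non-negative power spectrum obtained by exponentiating a log power spectrum. The BLSTM stage is not shown; the input matrix is toy data standing in for its output, and the helper names (`log_power_to_power`, `nmf`, `frobenius_error`) are hypothetical. It uses only the standard library so that it runs without dependencies.

```python
import math
import random


def log_power_to_power(log_spec):
    """Map a (natural-log) power spectrogram back to non-negative power values."""
    return [[math.exp(v) for v in row] for row in log_spec]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, n_iter=200, seed=0, eps=1e-12):
    """Factorize non-negative V (freq x time) as W H with multiplicative updates.

    Assumption: Euclidean cost; the paper's actual cost and rank are not stated.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_cols)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_rows)]
    return w, h


# Toy log power spectrum (4 frequency bins x 5 frames), standing in for the
# first-stage output; these values are illustrative, not data from the paper.
log_spec = [
    [-1.0, -0.5, 0.0, -0.5, -1.0],
    [0.5, 1.0, 1.5, 1.0, 0.5],
    [-2.0, -1.0, -0.5, -1.0, -2.0],
    [0.0, 0.2, 0.4, 0.2, 0.0],
]
power = log_power_to_power(log_spec)
w, h = nmf(power, rank=2)

# Return to the log domain after the low-rank reconstruction.
approx_log = [[math.log(max(x, 1e-12)) for x in row] for row in matmul(w, h)]
print("reconstruction error:", frobenius_error(power, w, h))
```

    Working in the power domain rather than directly on the log power spectrum is a choice made for this sketch: NMF requires non-negative input, and log power values can be negative.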

     
