Speech Enhancement Based on Nonnegative Matrix Factorization: An Overview

鲍长春; 白志刚

doi:10.16798/j.issn.1003-0530.2020.06.001

Abstract

Speech enhancement plays an important role in speech signal processing; its goal is to reduce the influence of background noise on speech signals. However, effectively separating the target speech in highly nonstationary noise environments remains a challenging problem. Speech enhancement algorithms based on nonnegative matrix factorization (NMF) model the spectral subspaces of speech and noise with nonnegative speech and noise basis matrices, and are currently an advanced technique that is very effective at suppressing nonstationary noise. This paper first introduces the theory of NMF in detail, including the NMF model, the definition of the cost function, and the commonly used multiplicative update rules. It then reviews in detail the basic principle of NMF-based speech enhancement, including the specific procedures of the training and enhancement stages, and reports experiments. In addition, an NMF-based speech reconstruction experiment is used to verify the ability of the speech basis matrix to model speech spectra. Finally, the shortcomings of traditional NMF-based algorithms are summarized, several existing NMF-based algorithms are briefly reviewed along with their innovations, advantages, and disadvantages, and several representative methods are compared and analyzed. The paper thus presents the continuing development of NMF-based speech enhancement methods from a historical perspective.
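The pipeline named in the abstract (NMF model, cost function, multiplicative update rules, training stage, enhancement stage) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the generalized Kullback-Leibler divergence as the cost function, which is one common choice, and a Wiener-like mask in the enhancement stage. The names `nmf_kl`, `kl_divergence` and `enhance` are hypothetical, and random nonnegative matrices stand in for magnitude spectrograms.

```python
import numpy as np

EPS = 1e-12  # small constant guarding against division by zero and log(0)


def kl_divergence(V, V_hat):
    """Generalized Kullback-Leibler divergence D(V || V_hat)."""
    V = V + EPS
    V_hat = V_hat + EPS
    return float(np.sum(V * np.log(V / V_hat) - V + V_hat))


def nmf_kl(V, rank, n_iter=100, W=None, seed=0):
    """Approximate nonnegative V (F x T) as W @ H with multiplicative updates.

    If W is supplied it is held fixed and only the activations H are
    updated (in that case `rank` is ignored and W.shape[1] is used).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    update_W = W is None
    if update_W:
        W = rng.random((n_freq, rank)) + EPS
    H = rng.random((W.shape[1], n_frames)) + EPS
    ones = np.ones_like(V)
    for _ in range(n_iter):
        # Multiplicative update of the activations H.
        H *= (W.T @ (V / (W @ H + EPS))) / (W.T @ ones + EPS)
        if update_W:
            # Multiplicative update of the basis matrix W.
            W *= ((V / (W @ H + EPS)) @ H.T) / (ones @ H.T + EPS)
    return W, H


def enhance(V_noisy, W_speech, W_noise, n_iter=100):
    """Enhancement stage: fix the trained bases, estimate H, apply a mask."""
    W = np.hstack([W_speech, W_noise])
    _, H = nmf_kl(V_noisy, W.shape[1], n_iter=n_iter, W=W)
    k = W_speech.shape[1]
    S_hat = W_speech @ H[:k]   # speech part of the model
    N_hat = W_noise @ H[k:]    # noise part of the model
    mask = S_hat / (S_hat + N_hat + EPS)  # Wiener-like gain in [0, 1)
    return mask * V_noisy


# Synthetic stand-ins for magnitude spectrograms (assumption: no real audio).
rng = np.random.default_rng(1)
V_speech_train = rng.random((32, 6)) @ rng.random((6, 60))
V_noise_train = rng.random((32, 3)) @ rng.random((3, 60))

# Training stage: learn one basis matrix per source.
W_speech, _ = nmf_kl(V_speech_train, rank=6)
W_noise, _ = nmf_kl(V_noise_train, rank=3)

# Enhancement stage on a noisy mixture of unseen frames.
V_noisy = rng.random((32, 6)) @ rng.random((6, 40)) + rng.random((32, 3)) @ rng.random((3, 40))
V_enhanced = enhance(V_noisy, W_speech, W_noise)
```

In practice `V` would be the magnitude (or power) spectrogram from a short-time Fourier transform, and the enhanced magnitude would be combined with the noisy phase before the inverse transform. Those steps are omitted here to keep the sketch self-contained.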
