Multichannel Speech MVDR Enhancement Algorithm Based on Joint Spatial-temporal Graph Topology

Abstract: This paper studies multichannel speech enhancement in the graph frequency domain. Using graph signal processing (GSP) theory, a joint graph topology over the temporal and spatial dimensions is constructed, and an enhancement algorithm is designed on top of it for multichannel speech denoising. Specifically, a temporal graph topology is built from the temporal correlation between the speech vertex signals of successive input frames at one microphone of the array; in parallel, a spatial graph topology is built from the spatial correlation of the noisy signals received across the channels. On the joint graph topology formed by these two topologies, a minimum variance distortionless response (MVDR) enhancement algorithm in the graph frequency domain performs the multichannel speech enhancement. Simulation results show that the proposed joint graph topology-based MVDR beamforming method (JG-MVDR) outperforms both the conventional graph MVDR beamforming method (GMVDR) and the complex Gaussian mixture model-based MVDR beamforming method (CGMM-MVDR) in terms of the average perceptual evaluation of speech quality (PESQ) score and the average extended short-time objective intelligibility (ESTOI).
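For readers unfamiliar with the MVDR step named in the abstract, the following is a minimal sketch of the classical MVDR weight formula, w = R^{-1} d / (d^H R^{-1} d), evaluated for a single frequency bin. It is not the paper's JG-MVDR algorithm: the abstract does not give the graph construction or the graph-domain covariance estimates, so the noise covariance `R`, the steering vector `d`, and the helper names `solve`, `mvdr_weights`, and `apply_beamformer` are all hypothetical and chosen only for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (complex-safe)."""
    n = len(b)
    # Work on an augmented copy so the caller's inputs are not modified.
    M = [list(map(complex, A[i])) + [complex(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0j] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def mvdr_weights(R, d):
    """Classical MVDR weights w = R^{-1} d / (d^H R^{-1} d)."""
    r_inv_d = solve(R, d)
    denom = sum(complex(di).conjugate() * xi for di, xi in zip(d, r_inv_d))
    return [xi / denom for xi in r_inv_d]


def apply_beamformer(w, x):
    """Beamformer output y = w^H x for one multichannel observation x."""
    return sum(wi.conjugate() * complex(xi) for wi, xi in zip(w, x))


# Toy two-channel example (assumed values, not taken from the paper).
R = [[2 + 0j, 0.5 + 0.5j],
     [0.5 - 0.5j, 1 + 0j]]   # Hermitian, positive-definite noise covariance
d = [1 + 0j, 1j]             # hypothetical steering vector
w = mvdr_weights(R, d)
```

The defining property of MVDR is the distortionless constraint: a signal arriving along `d` passes with unit gain, so `apply_beamformer(w, d)` evaluates to 1 up to rounding, while the output noise power w^H R w is minimized. In the setting the abstract describes, the same formula would be applied to graph-frequency-domain coefficients rather than to conventional frequency bins.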

     
