Adaptive Speech Endpoint Detection Based on Multi-parameter Fusion at Low SNR

Abstract: Traditional speech endpoint detection methods exploit the difference between speech and noise in a single parameter to locate the start and end points of speech in a signal. At low signal-to-noise ratios, however, each individual parameter behaves inconsistently across noise environments, so robustness is poor. To address this problem, this paper proposes a speech endpoint detection method that fuses four parameters: uniform sub-band spectral variance, energy-entropy ratio, Mel cepstral distance, and likelihood ratio. The method adapts the threshold of each parameter and selects the voting decision mechanism by monitoring the energy-entropy ratio of the noise segments in real time, and then uses that mechanism to decide the speech endpoints. Experimental results show that, at low signal-to-noise ratios, the proposed method achieves higher detection accuracy and robustness than commonly used endpoint detection methods, and it may serve as a reference for subsequent speech signal processing.
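The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the threshold update rule or the voting rules, so the "noise mean plus k standard deviations" threshold, the 2-of-4 versus 3-of-4 vote switch, and the names `adaptive_threshold`, `fuse_votes`, `endpoints`, `eer_switch`, and `k` are all assumptions made for illustration. The sketch takes precomputed per-frame values of each parameter and treats the first `n_noise_frames` frames as noise.

```python
from statistics import mean, pstdev


def adaptive_threshold(noise_values, k=3.0):
    """Per-parameter threshold from noise-only frames.

    Assumed rule (not from the paper): noise mean + k * noise std.
    """
    return mean(noise_values) + k * pstdev(noise_values)


def fuse_votes(features, n_noise_frames,
               eer_key="energy_entropy_ratio", eer_switch=1.0, k=3.0):
    """Return one speech/non-speech decision per frame by majority-style voting.

    features: dict mapping a parameter name to its list of per-frame values.
    The leading n_noise_frames frames are assumed to contain noise only.
    """
    # One adaptive threshold per parameter, estimated from the noise frames.
    thresholds = {name: adaptive_threshold(vals[:n_noise_frames], k)
                  for name, vals in features.items()}

    # Hypothetical switch: the noise-segment energy-entropy ratio selects
    # how many parameters must agree before a frame counts as speech.
    noise_eer = mean(features[eer_key][:n_noise_frames])
    votes_needed = 3 if noise_eer > eer_switch else 2

    n_frames = min(len(vals) for vals in features.values())
    decisions = []
    for i in range(n_frames):
        votes = sum(features[name][i] > thresholds[name] for name in features)
        decisions.append(votes >= votes_needed)
    return decisions


def endpoints(decisions):
    """Return (first, last) speech frame indices, or None if no speech."""
    speech = [i for i, is_speech in enumerate(decisions) if is_speech]
    return (speech[0], speech[-1]) if speech else None
```

In a real detector the thresholds would be re-estimated as new noise frames arrive, and the decisions would normally be smoothed (for example with a minimum speech duration) before endpoints are reported; both steps are omitted here for brevity.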

     
