Maximum Likelihood Subband Linear Regression for Robust Speech Recognition

  • Abstract: In real-world environments, a mismatch between training and testing conditions can severely degrade the performance of a speech recognition system. Model adaptation is an effective way to reduce this mismatch: it uses a small amount of adaptation data to transform the model parameters to the recognition environment. Maximum likelihood linear regression (MLLR) is a widely used transformation-based model adaptation algorithm, but its parameter estimates become inaccurate when little adaptation data is available, which can degrade recognition performance. To address this shortcoming, this paper proposes a model adaptation algorithm based on maximum likelihood subband linear regression (MLSLR). The algorithm divides the channels of the Mel filter bank into several subbands and assumes that, within each subband, the model mean components of all channels share a single linear environment transformation, which increases the data available for estimating each transformation. Experimental results show that the proposed algorithm copes well with the data sparsity problem and achieves good adaptation with very little data, making it particularly suitable for rapid model adaptation when only a small amount of data is available.
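The parameter-sharing idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes diagonal-covariance Gaussians, contiguous subbands of roughly equal width, and a scalar scale `a` and bias `b` per subband, none of which the abstract specifies. Under those assumptions the maximum likelihood estimate reduces to a weighted least-squares fit over all channels in a subband, with weights equal to the Gaussian occupancy divided by the variance. All function and variable names here are hypothetical.

```python
import numpy as np

def split_into_subbands(num_channels, num_subbands):
    """Partition channel indices into contiguous subbands (assumed layout)."""
    return np.array_split(np.arange(num_channels), num_subbands)

def estimate_subband_transforms(train_means, adapt_means, weights, subbands):
    """Fit adapt ~= a * train + b with one (a, b) pair per subband.

    All three arrays have shape (num_gaussians, num_channels).
    adapt_means holds the occupancy-weighted average of the adaptation
    data per Gaussian; weights holds occupancy / variance (assumption).
    """
    params = []
    for idx in subbands:
        # Pool every channel of the subband so they share one transform.
        x = train_means[:, idx].ravel()
        y = adapt_means[:, idx].ravel()
        w = weights[:, idx].ravel()
        total = w.sum()
        mean_x = (w * x).sum() / total
        mean_y = (w * y).sum() / total
        spread = (w * (x - mean_x) ** 2).sum()
        if spread > 1e-12:
            a = (w * (x - mean_x) * (y - mean_y)).sum() / spread
        else:
            a = 1.0  # degenerate case: fall back to a bias-only update
        b = mean_y - a * mean_x
        params.append((a, b))
    return params

def apply_subband_transforms(train_means, subbands, params):
    """Map training-condition means to the test condition, subband by subband."""
    adapted = np.array(train_means, dtype=float)
    for idx, (a, b) in zip(subbands, params):
        adapted[:, idx] = a * train_means[:, idx] + b
    return adapted
```

Pooling the channels of a subband means each `(a, b)` pair is estimated from several times more statistics than a per-channel transform would see, which is the data-sharing effect the abstract describes.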

     
