Speaker Verification using GMM-UBM with Embedded TDNN

  • Abstract: This paper proposes a Gaussian Mixture Model-Universal Background Model (GMM-UBM) speaker verification method with an embedded Time Delay Neural Network (TDNN). It combines the merits of the TDNN, a discriminative model, and the GMM, a generative model. The TDNN captures the temporal structure of the feature vector sequence and passes this time information to the GMM. The TDNN's transformation of the feature vectors also makes the variable-independence assumption required by maximum likelihood (ML) estimation more reasonable. The GMM and the TDNN are trained as a whole with maximum likelihood as the training criterion, and their parameters are updated alternately during training. Experiments show that the proposed method combined with TNorm reduces the equal error rate (EER) by about 28% compared with the baseline GMM-UBM system.
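The abstract reports its result in terms of TNorm and EER without defining either. As a rough illustration only, the sketch below shows the common textbook form of test normalization (a raw score is shifted and scaled by the mean and standard deviation of scores that a cohort of impostor models gives the same test utterance) and a simple threshold-sweep approximation of the equal error rate. The function names, the use of the population standard deviation, and the toy scores are assumptions made for this example; they are not taken from the paper.

```python
from statistics import mean, pstdev


def tnorm(raw_score, cohort_scores):
    """Test-normalize one score (hypothetical helper, not from the paper).

    cohort_scores are the scores of the same test utterance against a
    set of impostor (cohort) speaker models.
    """
    mu = mean(cohort_scores)
    sigma = pstdev(cohort_scores)  # assumption: population std deviation
    if sigma == 0:
        raise ValueError("cohort scores have zero variance")
    return (raw_score - mu) / sigma


def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    A trial is accepted when its score is >= the threshold. The value
    returned is the average of the false-accept and false-reject rates
    at the threshold where the two rates are closest.
    """
    best_gap, eer = None, None
    for thr in sorted(set(target_scores) | set(impostor_scores)):
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        far = sum(s >= thr for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


if __name__ == "__main__":
    # Toy numbers, purely illustrative.
    print(tnorm(4.0, [1.0, 3.0]))
    print(equal_error_rate([1.0, 3.0], [0.0, 2.0]))
```

In a verification system of this kind, each trial score would typically be passed through `tnorm` before the EER is measured over all target and impostor trials.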

     
