SHU Wei-Xin, GUO Wu. Speaker Diarization Based on Length Normalization MAP[J]. JOURNAL OF SIGNAL PROCESSING, 2016, 32(7): 859-865. DOI: 10.16798/j.issn.1003-0530.2016.07.014

Speaker Diarization Based on Length Normalization MAP

  • We propose a length-normalized maximum a posteriori (MAP) algorithm that can be applied to the Cross Likelihood Ratio (CLR) and TTest distance metrics in speaker diarization. Because the shift away from the Universal Background Model (UBM) during adaptation is driven by statistics computed against the UBM, the model parameters obtained with classical MAP are positively correlated with the length of the speech segment. When the similarity of two segments of different lengths is measured, classical MAP therefore introduces length-dependent variability into the speaker models, which degrades the distance metric used in speaker diarization. We apply length normalization to the relevance factor before adapting the parameters of the speaker model, so that the adapted parameters no longer depend on the length of the speech and better reflect the speaker's identity. On a speaker diarization task over Chinese multi-speaker TV talk shows, the proposed normalized MAP method reduces the diarization error rate, compared with classical MAP, by 35% with the CLR clustering method and by 107% with the TTest clustering method.
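The length dependence described in the abstract can be illustrated with a minimal sketch. The abstract does not give the update formulas, so everything below is an assumption: the code uses the standard mean-only relevance MAP update (adaptation coefficient n / (n + r)) on a toy 1-D, two-component GMM standing in for the UBM, and it models length normalization by scaling the relevance factor r in proportion to segment length. The function name `map_adapt_means` and the parameter `ref_length` are hypothetical and do not come from the paper.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def map_adapt_means(frames, weights, means, variances, relevance=16.0, ref_length=None):
    """Mean-only MAP adaptation of a toy 1-D GMM used as the UBM.

    ref_length=None: classical relevance MAP with a fixed relevance factor.
    ref_length=L:    ASSUMED length normalization, where the relevance factor
                     is scaled by len(frames) / L (hypothetical, not from the paper).
    """
    num_comp = len(weights)
    n = [0.0] * num_comp  # zeroth-order statistics (soft frame counts)
    f = [0.0] * num_comp  # first-order statistics
    for x in frames:
        likes = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(likes)
        for c in range(num_comp):
            post = likes[c] / total  # posterior of component c for this frame
            n[c] += post
            f[c] += post * x

    r = relevance if ref_length is None else relevance * len(frames) / ref_length

    adapted = []
    for c in range(num_comp):
        alpha = n[c] / (n[c] + r)  # adaptation coefficient
        data_mean = f[c] / n[c] if n[c] > 0 else means[c]
        adapted.append(alpha * data_mean + (1.0 - alpha) * means[c])
    return adapted


# Toy UBM and two segments with identical content but different lengths.
ubm_weights = [0.5, 0.5]
ubm_means = [-1.0, 1.0]
ubm_vars = [1.0, 1.0]
pattern = [0.8, 1.5, 2.1, 1.2, -0.3, 1.9]
short_seg = pattern
long_seg = pattern * 10

classical_short = map_adapt_means(short_seg, ubm_weights, ubm_means, ubm_vars)
classical_long = map_adapt_means(long_seg, ubm_weights, ubm_means, ubm_vars)
norm_short = map_adapt_means(short_seg, ubm_weights, ubm_means, ubm_vars, ref_length=len(pattern))
norm_long = map_adapt_means(long_seg, ubm_weights, ubm_means, ubm_vars, ref_length=len(pattern))

print("classical :", classical_short, classical_long)
print("normalized:", norm_short, norm_long)
```

Under these assumptions, the classical update moves the long segment's means further from the UBM than the short segment's, even though both segments contain the same data. With the scaled relevance factor, the two adapted models coincide. The paper's actual normalization may differ in form.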
