Abstract:
Confidence measures computed within the Hidden Markov Model (HMM) framework of a keyword spotting system have several shortcomings. This paper therefore presents a confidence measure based on frame-level sub-word posterior probabilities from a Multi-layer Perceptron (MLP). Conventionally, confidence is calculated from the acoustic and language model scores computed by the HMM recogniser, which rests on some incorrect assumptions, such as the frame-wise (and possibly component-wise) independence of acoustic features and a finite number of Gaussian mixtures. The proposed confidence measure is instead calculated directly from the frame-level sub-word posterior probabilities produced by an MLP network. Confidence estimation is thus completely separated from keyword spotting, and the two use different models. With this separation, decisions can be based on more reliable confidence scores, and multiple confidence features can be integrated to improve decision quality. Experimental results show that the proposed approach outperforms the mainstream confidence measures of the HMM framework and is complementary to them: when the two are combined, the Equal Error Rate (EER) of the keyword spotting system achieves an 11.5% relative improvement.
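
The abstract does not state the exact formula for the confidence measure. As an illustration only, the sketch below shows one common way a keyword-level confidence can be derived from frame-level sub-word posteriors: average the log posterior of each hypothesised sub-word over its frames, then average over the sub-words of the keyword. The function names, the dictionary representation of the MLP outputs, the alignment format and the probability floor are all assumptions made for this example, not details taken from the paper.

```python
import math


def keyword_confidence(posteriors, alignment, floor=1e-10):
    """Illustrative keyword confidence from frame-level sub-word posteriors.

    posteriors: one dict per frame, mapping sub-word label -> P(label | frame).
                (Assumed stand-in for the outputs of an MLP.)
    alignment:  list of (label, start_frame, end_frame), end exclusive.
                (Assumed stand-in for the segmentation given by the spotter.)
    floor:      assumed lower bound on probabilities, to avoid log(0).
    """
    segment_scores = []
    for label, start, end in alignment:
        if end <= start:
            raise ValueError("empty segment for sub-word " + label)
        # Mean log posterior of this sub-word over the frames assigned to it.
        log_sum = sum(
            math.log(max(posteriors[t].get(label, 0.0), floor))
            for t in range(start, end)
        )
        segment_scores.append(log_sum / (end - start))
    # Average over sub-words so that long sub-words do not dominate.
    return sum(segment_scores) / len(segment_scores)


def accept(confidence, threshold):
    """Accept the keyword hypothesis if its confidence reaches the threshold."""
    return confidence >= threshold


# Toy example: four frames, two sub-words (values invented for illustration).
frames = [
    {"k": 0.9, "ae": 0.1},
    {"k": 0.8, "ae": 0.2},
    {"k": 0.1, "ae": 0.9},
    {"k": 0.2, "ae": 0.8},
]
good = keyword_confidence(frames, [("k", 0, 2), ("ae", 2, 4)])
bad = keyword_confidence(frames, [("ae", 0, 2), ("k", 2, 4)])
```

In this toy example the hypothesis whose sub-words match the high-posterior frames (`good`) scores higher than the mismatched one (`bad`). Because the score depends only on the posteriors and the segment boundaries, it can be computed by a model separate from the one that performs the spotting, and it could be combined with other confidence features before the threshold is applied.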