Abstract:
The keyword spotting system based on the point process model is a novel approach to keyword spotting in continuous speech. Although this system requires few training samples and is fast to compute, its performance depends largely on the accuracy of the front-end phoneme detector. However, the Gaussian mixture model (GMM), which is widely used in phoneme detectors, is weak at feature representation and modeling. To solve this problem, this paper proposes a point process model embedded with deep belief networks (DBNs) and applies it to keyword spotting. The model builds its phoneme detector with deep belief networks, which have a strong capability for feature representation, to overcome the GMM's shortcomings in this respect. Experimental results show that the method achieves a higher detection rate than the original model while reducing computational complexity, and that it meets the real-time requirements of keyword spotting well.