Abstract:
This paper studies a hierarchical Dirichlet process hidden Markov model (HDPHMM) approach to unsupervised query-by-example spoken term detection (QbE-STD). First, a hierarchical hidden Markov model is applied, in which the top-layer states represent the discovered acoustic units and the bottom-layer states model the emission probabilities of the top-layer states. Imposing a hierarchical Dirichlet process prior on the top-layer states yields a nonparametric Bayesian model, the HDPHMM. After the model is trained on unlabeled speech data, it outputs posteriorgram feature vectors for the test utterances and the query term. The posteriorgram features are then optimized with a non-negative matrix factorization algorithm, and detection is performed with a modified SDTW algorithm. Experimental results show that the proposed method outperforms a baseline system based on a Gaussian mixture model tokenizer and markedly improves detection precision.
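
To make the detection step concrete, the following is a minimal sketch of matching a query posteriorgram against a test-utterance posteriorgram. It rests on several assumptions that are not taken from the paper: it uses plain subsequence DTW as a stand-in for the modified SDTW algorithm, it uses the negative log inner product as the frame distance, and the names `toy_posteriorgram`, `posteriorgram_distance`, and `subsequence_dtw` are hypothetical. The HDPHMM training and the non-negative matrix factorization step are not shown.

```python
import numpy as np


def toy_posteriorgram(units, num_units, peak=0.97):
    """Hypothetical helper: one posterior vector per frame, peaked at the given unit."""
    off = (1.0 - peak) / (num_units - 1)
    post = np.full((len(units), num_units), off)
    post[np.arange(len(units)), units] = peak
    return post


def posteriorgram_distance(query, utterance, eps=1e-10):
    """Frame-pair distance -log(q . t); an assumed choice, not the paper's."""
    similarity = query @ utterance.T
    return -np.log(np.maximum(similarity, eps))


def subsequence_dtw(query, utterance):
    """Plain subsequence DTW (a stand-in for the modified SDTW).

    Returns (length-normalized cost, start frame, end frame) of the
    best-matching region of `utterance`.
    """
    dist = posteriorgram_distance(query, utterance)
    m, n = dist.shape
    acc = np.full((m, n), np.inf)
    start = np.zeros((m, n), dtype=int)
    # The match may begin at any utterance frame at no extra cost.
    acc[0, :] = dist[0, :]
    start[0, :] = np.arange(n)
    for i in range(1, m):
        for j in range(n):
            # Predecessors: vertical, then diagonal and horizontal if they exist.
            candidates = [(acc[i - 1, j], start[i - 1, j])]
            if j > 0:
                candidates.append((acc[i - 1, j - 1], start[i - 1, j - 1]))
                candidates.append((acc[i, j - 1], start[i, j - 1]))
            best_cost, best_start = min(candidates, key=lambda c: c[0])
            acc[i, j] = dist[i, j] + best_cost
            start[i, j] = best_start
    # The match may end at any utterance frame.
    end = int(np.argmin(acc[m - 1, :]))
    return float(acc[m - 1, end]) / m, int(start[m - 1, end]), end
```

In this sketch, a lower normalized cost indicates a more likely occurrence of the query; a detection decision would compare the cost against a threshold chosen on held-out data.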