[1] |
Xu Ji, Zhang Ge, Yan Yonghang. Effective utilization of multiple examples in Query-by-Example spoken term detection [C]// ICASSP 2016. Shanghai, China. 2016: 5440-5444.
|
[2] |
Zhang Yichi, Duan Zhiyao. IMISound: an unsupervised system for sound query by vocal imitation [C]// ICASSP 2016. Shanghai, China. 2016: 2269-2273.
|
[3] |
Stefan Balke, Vlora Arifi-Müller, Lukas Lamprecht, Meinard Müller. Retrieving audio recordings using musical themes [C]// ICASSP 2016. Shanghai, China. 2016: 281-285.
|
[4] |
David R. H. Miller, Michael Kleber, Chia-lin Kao, et al. Rapid and accurate spoken term detection [C]// Interspeech 2007. Antwerp, Belgium. 2007: 314-317.
|
[5] |
Xu Haihua, Hou Jingyong, Xiao Xiong, et al. Approximate search of audio queries by using DTW with phone time boundary and data augmentation [C]// ICASSP 2016. Shanghai, China. 2016: 6030-6034.
|
[6] |
Zhang Yaodong, Glass J. Unsupervised spoken keyword spotting via segmental DTW on Gaussian posteriorgrams [A]. In: Proc. of IEEE Automatic Speech Recognition and Understanding Workshop [C]. Merano, Italy. 2009: 398-403.
|
[7] |
Wang Haipeng, Leung C, Lee T, et al. An acoustic segment modeling approach to query-by-example spoken term detection [C]// ICASSP 2012. Kyoto, Japan. 2012: 5157-5160.
|
[8] |
Chung Cheng-tao, Chan Chun-an, Lee Lin-shan. Unsupervised discovery of linguistic structure including two-level acoustic patterns using three cascaded stages of iterative optimization [C]// ICASSP 2013. Vancouver, Canada. 2013: 8081-8085.
|
[9] |
Chengtao Chung, Weining Hsu, Chengyi Lee, Linshan Lee. Enhancing automatically discovered multi-level acoustic patterns considering context consistency with applications in spoken term detection [C]// ICASSP 2015. Brisbane, Australia. 2015: 5231-5235.
|
[10] |
Brenden M. Lake, Chia-ying Lee, James R. Glass, Joshua B. Tenenbaum. One-shot learning of generative speech concepts [C]// Proceedings of the 36th Annual Meeting of the Cognitive Science Society. 2014: 803-808.
|
[11] |
Chia-ying Lee, James Glass. A nonparametric Bayesian approach to acoustic model discovery [C]// Proceedings of ACL. 2012: 40-49.
|
[12] |
S. Ganapathy. Signal analysis using autoregressive models of amplitude modulation [D]. Baltimore, Maryland, USA: Johns Hopkins University, 2012: 60-68.
|
[13] |
G. Mantena, S. Achanta, K. Prahallad. Query-by-example spoken term detection using frequency domain linear prediction and non-segmental dynamic time warping [J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2014, 22(5): 946-955.
|
[14] |
Teh, Y., Jordan, M., Beal, M., & Blei, D. Hierarchical Dirichlet processes [J]. Journal of the American Statistical Association, 2006, 101(476): 1566-1581.
|
[15] |
D. D. Lee, H. S. Seung. Learning the parts of objects by non-negative matrix factorization [J]. Nature, 1999, 401(6755): 788-791.
|
[16] |
John S. Garofolo, Lori F. Lamel, William M. Fisher, et al. TIMIT acoustic-phonetic continuous speech (MS-WAV version) [J]. Journal of the Acoustical Society of America, 1990, 88(88): 210-221.
|