[1] |
Hinton G, Deng L, Yu D, et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups[J]. IEEE Signal Processing Magazine, 2012, 29(6):82-97.
|
[2] |
Valtchev V, Odell J, Woodland P C, et al. Lattice-based discriminative training for large vocabulary speech recognition[C]// 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings. IEEE Computer Society, 1996:605-608.
|
[3] |
Povey D, Kanevsky D, Kingsbury B, et al. Boosted MMI for model and feature-space discriminative training[C]// IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2008:4057-4060.
|
[4] |
Povey D, Woodland P C. Minimum Phone Error and I-smoothing for improved discriminative training[C]// IEEE International Conference on Acoustics, Speech, and Signal Processing. IEEE, 2002:I-105-I-108.
|
[5] |
Povey D, Kingsbury B. Evaluation of Proposed Modifications to MPE for Large Scale Discriminative Training[C]// IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2007:IV-321-IV-324.
|
[6] |
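Chen Lei, Yang Jun'an, Wang Yi, et al. A feature extraction method based on discriminative and adaptive bottleneck deep belief networks in LVCSR systems[J]. Journal of Signal Processing (信号处理), 2015, 31(3):290-298. (in Chinese)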
|
[7] |
Voigtlaender P, Doetsch P, Wiesler S, et al. Sequence-discriminative training of recurrent neural networks[C]// IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2015:2100-2104.
|
[8] |
Graves A. Connectionist Temporal Classification[M]// Supervised Sequence Labelling with Recurrent Neural Networks. Springer Berlin Heidelberg, 2012:61-93.
|
[9] |
Graves A, Jaitly N. Towards end-to-end speech recognition with recurrent neural networks[C]// International Conference on Machine Learning. 2014:1764-1772.
|
[10] |
Miao Y, Gowayyed M, Metze F. EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding[C]// Automatic Speech Recognition and Understanding. IEEE, 2016:167-174.
|
[11] |
Cho K, van Merrienboer B, Gulcehre C, et al. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation[J]. arXiv preprint arXiv:1406.1078, 2014.
|
[12] |
Bahdanau D, Cho K, Bengio Y. Neural Machine Translation by Jointly Learning to Align and Translate[J]. arXiv preprint arXiv:1409.0473, 2014.
|
[13] |
Xu K, Ba J, Kiros R, et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention[J]. arXiv preprint arXiv:1502.03044, 2015.
|
[14] |
Chorowski J, Bahdanau D, Serdyuk D, et al. Attention-Based Models for Speech Recognition[J]. arXiv preprint arXiv:1506.07503, 2015.
|
[15] |
Bahdanau D, Chorowski J, Serdyuk D, et al. End-to-end attention-based large vocabulary speech recognition[C]// IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2016:4945-4949.
|
[16] |
Jozefowicz R, Zaremba W, Sutskever I. An empirical exploration of recurrent network architectures[C]// International Conference on Machine Learning. JMLR.org, 2015:2342-2350.
|
[17] |
Hochreiter S, Schmidhuber J. Long Short-Term Memory[J]. Neural Computation, 1997, 9(8):1735-1780.
|
[18] |
Zhou G B, Wu J, Zhang C L, et al. Minimal Gated Unit for Recurrent Neural Networks[J]. International Journal of Automation and Computing, 2016, 13(3):226-234.
|