Multilingual Speech Recognition Research Based on an End-to-End Model

  • Abstract: End-to-end speech recognition models do not require a pronunciation dictionary for training, which can significantly reduce the burden of developing speech recognition systems for new languages. This paper exploits this advantage to build a language-independent end-to-end multilingual speech recognition system. The model is trained with a character-based modeling approach, and a multilingual output symbol set is constructed to include every character that appears in any of the target languages. Training yields a single model whose network parameters are shared across all languages. On the 10-language dataset provided by the Oriental Language Recognition (OLR) Challenge, the proposed multilingual system outperforms the monolingual speech recognition systems on every language.
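
As a rough illustration of the multilingual output symbol set described in the abstract, the following minimal Python sketch takes the union of characters over per-language transcripts and maps text to indices in that shared set. The function names, the `<blank>` and `<unk>` special symbols, the choice to drop whitespace, and the toy transcripts are all assumptions made for illustration; the abstract does not specify these details.

```python
from typing import Dict, Iterable, List

# Hypothetical special symbols; the abstract does not say which ones are used.
SPECIALS = ["<blank>", "<unk>"]


def build_symbol_set(transcripts_by_lang: Dict[str, Iterable[str]]) -> List[str]:
    """Return one symbol list covering the characters of every language."""
    chars = set()
    for lines in transcripts_by_lang.values():
        for line in lines:
            # Assumption: whitespace is not modeled as an output symbol.
            chars.update(ch for ch in line if not ch.isspace())
    # Sort for a deterministic index assignment across runs.
    return SPECIALS + sorted(chars)


def encode(text: str, symbols: List[str]) -> List[int]:
    """Map each character to its index in the shared symbol set."""
    index = {s: i for i, s in enumerate(symbols)}
    unk = index["<unk>"]
    return [index.get(ch, unk) for ch in text if not ch.isspace()]


# Toy transcripts, invented for this example only.
toy = {
    "zh": ["你好"],
    "ja": ["こんにちは"],
    "ru": ["привет"],
}
symbols = build_symbol_set(toy)
```

Because every language indexes into the same list, a single output layer of size `len(symbols)` can serve all languages, which is one way to read the abstract's statement that the network parameters are shared.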

     
