HU Wenxuan, WANG Qiulin, LI Song, HONG Qingyang, LI Lin. Multi-lingual Speech Recognition Research based on End-to-end Model[J]. JOURNAL OF SIGNAL PROCESSING, 2021, 37(10): 1816-1824. DOI: 10.16798/j.issn.1003-0530.2021.10.004

Multi-lingual Speech Recognition Research based on End-to-end Model

  • End-to-end speech recognition models do not require pronunciation dictionaries during training, which can significantly reduce the burden of developing speech recognition systems for new languages. This paper exploits that advantage to build a language-independent, end-to-end multilingual speech recognition system. The model is character-based: a multilingual output symbol set is constructed as the union of the characters that occur in the target languages. Training yields a single model whose network parameters are shared by all languages. On the 10-language dataset provided by the Oriental Language Recognition (OLR) Challenge, the proposed multilingual system outperforms the monolingual speech recognition systems on every language.
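
The construction of the shared output symbol set can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `build_multilingual_vocab` and `encode`, the special tokens (`<blank>`, `<unk>`, `<sos/eos>`, `<space>`), and the toy transcripts are all assumptions made for illustration. The sketch only shows the idea stated in the abstract, namely that the symbol set is the union of the characters seen across all target languages, so one output layer can serve every language.

```python
from collections import Counter


def build_multilingual_vocab(transcripts_by_lang,
                             specials=("<blank>", "<unk>", "<sos/eos>")):
    """Return a symbol -> index map covering the characters of every language.

    transcripts_by_lang: dict mapping a language code to a list of transcripts.
    The special tokens are hypothetical; the abstract does not specify them.
    """
    counts = Counter()
    for lines in transcripts_by_lang.values():
        for line in lines:
            # Whitespace is represented by one explicit <space> symbol below.
            counts.update(ch for ch in line if not ch.isspace())
    # Sorting makes the index assignment reproducible across runs.
    symbols = list(specials) + ["<space>"] + sorted(counts)
    return {sym: idx for idx, sym in enumerate(symbols)}


def encode(text, vocab):
    """Map a transcript to symbol indices using the shared vocabulary."""
    unk = vocab["<unk>"]
    return [vocab["<space>"] if ch.isspace() else vocab.get(ch, unk)
            for ch in text]


if __name__ == "__main__":
    # Toy transcripts, invented for illustration only.
    corpus = {
        "ja": ["こんにちは"],
        "ru": ["привет мир"],
        "vi": ["xin chào"],
    }
    vocab = build_multilingual_vocab(corpus)
    print(len(vocab), "symbols in the shared set")
    print(encode("xin мир", vocab))
```

Because every language maps into the same index space, a single softmax output layer and a single set of network parameters can be trained on the pooled data, with no per-language pronunciation dictionary.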