XU You-Liang, ZHANG Lian-Hai, ZHANG Wen-Lin, LI Yong-Bin. A Speaking Rate Adaptation Technique and Phonological Attribute Posterior for Phone Recognition[J]. JOURNAL OF SIGNAL PROCESSING, 2012, 28(2): 295-300.

A Speaking Rate Adaptation Technique and Phonological Attribute Posterior for Phone Recognition

More Information
  • Received Date: October 08, 2011
  • Revised Date: January 05, 2012
  • Published Date: February 24, 2012
  • Event-detection-based methods have become a state-of-the-art technique in Automatic Speech Recognition (ASR). Differences in speaking rate can impair the adaptability of acoustic models. To address this, this paper proposes a novel adaptation algorithm that adjusts the frame length and step size in the front end of the system, using a single utterance as the unit. After adaptation, the speaking rate of each utterance is consistent with the average rate of the speech corpus, which reduces the effect of rate variation on model training. In addition, the method estimates the speaking rate of the test set from the angle between posterior probability vectors, which lightens the load on the system compared with estimating the rate by training models. The algorithm is applied as pre-processing before the phonological feature detection stage; the detected features are then passed through a nonlinear transformation and used as the observations of a Hidden Markov Model (HMM) based phone recognition system. After adaptation, the average number of frames per phone in an utterance becomes constant and its dynamic range decreases, so the phoneme classification rate increases by about 1.3%.
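The abstract gives no formulas, so the following is only a minimal sketch of one possible reading of the adaptation step, not the authors' implementation. It assumes that the mean angle between posterior vectors of consecutive frames serves as the speaking-rate measure, and that frame length and step size are scaled per utterance by the ratio of the corpus-average rate to the utterance rate. The function names and the 25 ms / 10 ms defaults are hypothetical.

```python
import math


def angle_between(p, q):
    """Angle in radians between two posterior probability vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    if norm == 0.0:
        return 0.0
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def rate_proxy(posteriors):
    """Mean angle between consecutive frames (assumed speaking-rate measure)."""
    if len(posteriors) < 2:
        return 0.0
    angles = [angle_between(p, q) for p, q in zip(posteriors, posteriors[1:])]
    return sum(angles) / len(angles)


def adapted_frame_params(utt_rate, corpus_rate,
                         base_frame_ms=25.0, base_step_ms=10.0):
    """Scale frame length and step size for one utterance.

    A faster-than-average utterance gets a shorter frame and step, so that
    each phone is covered by roughly the same number of frames.
    """
    if utt_rate <= 0.0 or corpus_rate <= 0.0:
        return base_frame_ms, base_step_ms
    scale = corpus_rate / utt_rate
    return base_frame_ms * scale, base_step_ms * scale


# Toy posteriors over two attribute classes (illustrative values only).
slow_utt = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
fast_utt = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]

corpus_rate = rate_proxy(slow_utt)  # pretend this is the corpus average
frame_ms, step_ms = adapted_frame_params(rate_proxy(fast_utt), corpus_rate)
print(frame_ms, step_ms)
```

In this toy case the fast utterance changes its posterior twice as often per frame as the reference, so the sketch halves both the frame length and the step size. A real front end would then re-extract features with these per-utterance settings before phonological feature detection.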
