A Feature Extraction Method Based on Discriminative and Adaptive Bottleneck Deep Belief Network in Large Vocabulary Continuous Speech Recognition System

  • Abstract: To further improve the robustness and recognition accuracy of bottleneck deep belief networks in large vocabulary continuous speech recognition (LVCSR) systems, this paper proposes a feature extraction method based on a discriminative and adaptive bottleneck deep belief network. The method first uses a bottleneck deep belief network, which is itself fairly robust, for preliminary feature extraction. Discriminative training is then applied to make the network more discriminative and to raise recognition accuracy. On this basis, a speaker adaptation technique is introduced to adjust the network and improve the robustness of the system. The proposed acoustic features were tested on several public continuous speech databases with strong noise and casual topics and speaking styles, yielding a relative improvement of 6.9% in recognition accuracy. The experimental results demonstrate the advantage of the proposed feature extraction method over conventional methods.

     
