Spectrogram Speech Emotion Recognition Method Based on Auditory Attention Model

  • Abstract: In speech emotion recognition, mismatch between the trained acoustic models and the test utterances, caused by noise conditions, speaking styles, and speaker traits, produces unmatched features across databases. Phonetic analysis shows that this problem arises chiefly in cross-corpus emotion recognition experiments, where it drastically degrades recognition performance. This work therefore investigates a selective auditory attention acoustic model that effectively detects varying emotional features. The model is further improved with Chirplet time-frequency atoms, enabling it to extract salient gist features across speech corpora for emotion recognition. Experimental results show that, with the proposed feature extraction approach, a prototypical classifier achieves an accuracy gain of up to 9.6% in cross-corpus emotion recognition, confirming that the method is more robust across different databases.
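The pipeline the abstract describes (time-domain signal → correlation with Chirplet time-frequency atoms → pooled salient "gist" features → classifier input) can be sketched as below. This is a minimal illustration, not the paper's implementation: the atom parameters (`f0s`, `rates`, `sigma`), the max-pooling, and all function names are assumptions chosen for clarity.

```python
import numpy as np

def chirplet(t, tc, f0, c, sigma):
    """Gaussian-windowed linear chirp ("chirplet") atom centred at time tc,
    with start frequency f0 (Hz), chirp rate c (Hz/s), and width sigma (s)."""
    tau = t - tc
    return np.exp(-0.5 * (tau / sigma) ** 2) * np.cos(
        2.0 * np.pi * (f0 * tau + 0.5 * c * tau ** 2))

def chirplet_gist(signal, sr, f0s=(200, 400, 800), rates=(-2000, 0, 2000),
                  sigma=0.01):
    """Correlate the signal against a small chirplet bank and pool the peak
    response magnitudes into a coarse, L2-normalised gist feature vector.
    Bank sizes and pooling are illustrative assumptions."""
    t = np.arange(len(signal)) / sr
    tc = t[len(t) // 2]                      # centre the atom in the frame
    feats = []
    for f0 in f0s:
        for c in rates:
            atom = chirplet(t, tc, f0, c, sigma)
            resp = np.abs(np.correlate(signal, atom, mode="same"))
            feats.append(resp.max())         # peak salience for this atom
    v = np.asarray(feats)
    return v / (np.linalg.norm(v) + 1e-12)   # scale-invariant gist vector

# toy check on a synthetic rising chirp (not real emotional speech)
sr = 8000
t = np.arange(sr // 2) / sr                  # 0.5 s of signal
x = np.cos(2 * np.pi * (400 * t + 1000 * t ** 2))
gist = chirplet_gist(x, sr)
print(gist.shape)                            # (9,): 3 frequencies x 3 rates
```

In a full system the gist vector for each utterance frame would be fed to the classifier; atoms with a chirp rate matching the signal's frequency sweep respond most strongly, which is what makes the bank sensitive to the time-varying spectral cues associated with emotion.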

     
