Dimensional Speech Emotion Recognition Considering the Relative Order of Emotional Degree

Han Wenjing (韩文静); Li Haifeng (李海峰); Ma Lin (马琳)

Considering relative order of emotional degree in dimensional speech emotion recognition

Abstract

Abstract: Dimensional speech emotion recognition (Dim-SER) is an emerging branch of affective computing. It views emotion from a multidimensional, continuous perspective and formulates SER as a regression task that predicts continuous values. Current Dim-SER systems do not consider the relative order of emotional degree between utterances when making predictions, which seriously impairs a human-computer interaction system's ability to track the trend of a speaker's emotional change. To address this need, and taking the characteristics of human emotion cognition as a reference, this paper constructs a Dim-SER system that is sensitive to the relative order of emotional degree, and introduces the Gamma statistic to complement the criteria used to evaluate SER performance. Specifically, a Top-rank probability distribution is constructed to describe the emotional ordering of utterances, and the Kullback-Leibler divergence is used to measure the loss of order consistency caused by prediction. Finally, an order-sensitive neural network algorithm, the Order-Sensitive Network (OSNet), is proposed to minimize the prediction loss. Experimental results show that, compared with the commonly used k-Nearest Neighbor (k-NN) and Support Vector Regression (SVR) approaches, the proposed system effectively improves the correctness of the relative order of emotional degree between utterances.
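
The order-consistency loss and the evaluation measure named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so two assumptions are made here: the Top-rank probability distribution is taken to be a softmax over emotion scores (the probability that each utterance is ranked first, as in listwise learning-to-rank), and the Gamma statistic is taken to be the Goodman-Kruskal gamma computed over utterance pairs. The function names are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def top_rank_distribution(scores):
    """Assumed form: softmax over scores, i.e. P(utterance i is ranked first)."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) of two discrete distributions.

    Assumes q is strictly positive wherever p is positive.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def order_loss(true_scores, predicted_scores):
    """KL divergence between the label and prediction top-rank distributions."""
    return kl_divergence(top_rank_distribution(true_scores),
                         top_rank_distribution(predicted_scores))


def gamma_statistic(true_scores, predicted_scores):
    """Goodman-Kruskal gamma: (C - D) / (C + D) over all utterance pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(true_scores)), 2):
        product = ((true_scores[i] - true_scores[j])
                   * (predicted_scores[i] - predicted_scores[j]))
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # pairs tied in either list are ignored
    if concordant + discordant == 0:
        return 0.0  # convention chosen here when every pair is tied
    return (concordant - discordant) / (concordant + discordant)


# Illustrative values only, not data from the paper.
labels = [0.1, 0.5, 0.9]
same_order = [0.2, 0.4, 0.6]
reversed_order = [0.9, 0.5, 0.1]

print(gamma_statistic(labels, same_order))      # 1.0
print(gamma_statistic(labels, reversed_order))  # -1.0
print(order_loss(labels, labels))               # 0.0
print(order_loss(labels, reversed_order) > 0)   # True
```

In this sketch a predictor that preserves the ordering of the labels reaches a gamma of 1 even when its absolute values are off, which is the property a purely regression-based error measure does not capture. A network such as OSNet would minimize a loss of this kind by gradient descent; that training loop is omitted here.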
