Review on Speech Emotion Recognition

Abstract: Speech emotion recognition uses computers to model the relationship between the speech signal and measures of emotion, giving computers the ability to recognize and understand human emotions. It plays an important role in human-computer interaction and is an important development direction in the field of artificial intelligence. Starting from the development history of speech emotion recognition at home and abroad and the related conferences, journals, and competitions, this paper reviews the current state of research from six aspects. First, the discrete and dimensional models of emotion representation are described. Second, the existing speech emotion databases are surveyed and summarized. Third, representative developments in speech emotion recognition over the past 20 years are reviewed, and recognition techniques based on hand-crafted speech emotion features and on end-to-end frameworks are described in turn. Fourth, the performance of speech emotion recognition in recent years is summarized, with particular attention to work published in major speech conferences and journals in the past two years. Fifth, applications of speech emotion recognition in driving, intelligent interaction, medical health, safety, and other fields are introduced. Finally, the remaining challenges and future trends are discussed from three aspects: speech emotion databases, speech emotion features, and algorithms/models. This paper aims to analyze and summarize work on speech emotion recognition in depth and to provide a valuable reference for researchers in this field.

     
