A Survey of Speech-Based Depression Detection Research
Abstract: Depression is a common mental health disorder that seriously affects daily life and can even endanger life, so detecting depression and depressive mood is of great importance. Common modalities for depression detection include EEG, images, text, and speech. Among these, speech signals are easy to acquire and subject to few usage restrictions, which has made speech-based depression detection an active research topic. This paper reviews recent progress in speech-based depression detection. First, it introduces the depression speech datasets commonly used in current research and summarizes and analyzes the methods used to handle their data imbalance. Next, it gives an overview of the speech features commonly used in depression detection, including prosodic features, voice quality features, and spectrum-based features, and analyzes their characteristics. To address the scarcity of data in depression detection research, it then briefly describes the current mainstream few-shot learning methods from four perspectives: data augmentation, metric learning, meta-learning, and transfer learning. Given the privacy concerns surrounding depressive speech data, it also introduces research on speech-based depression detection using federated learning, covering both data security and deployment on edge devices. Finally, it summarizes the current state and open challenges of speech-based depression detection and discusses future directions.