Non-Contact Heart Rate Detection Algorithm Based on Transformer and CNN Feature Fusion
Abstract: Arrhythmia is a common risk factor for cardiovascular disease, and timely warning of arrhythmia has become an active research topic. Traditional heart rate detection requires physical contact, which makes measurement inconvenient. Remote photoplethysmography (rPPG) was proposed to enable contact-free measurement and has great application value in many scenarios. However, non-contact measurement is easily affected by the environment and by subject motion. To address these problems, this paper proposes TC-Net, a non-contact heart rate detection model that fuses the global representation of a Transformer with the local features of a CNN. TC-Net consists of two parallel branches: the CNN branch extracts local features of the rPPG signal, and the Transformer branch extracts its global representation. Because the global representation and the local features differ in feature length, this paper proposes a feature interaction module that uses convolutional layers together with up-sampling and down-sampling modules to align the feature lengths of the two branches. TC-Net then fuses the local features and the global representation through this module to obtain the final feature vector used for detection. Finally, comparative experiments were carried out on the MAHNOB-HCI dataset, one of the most widely used benchmarks for evaluating remote heart rate measurement. The model achieved the best results among the compared methods on all three metrics: mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE). TC-Net therefore offers high accuracy and robustness. In addition, TC-Net is built from only a few simple convolutional and attention layers, so its model complexity is low, which facilitates subsequent practical deployment.
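The three evaluation metrics named in the abstract (MAE, RMSE, and MAPE) can be illustrated with a minimal sketch based on their standard definitions. The function name `heart_rate_errors` and the sample heart rate values are hypothetical, chosen only for illustration; they are not taken from the paper or from the MAHNOB-HCI dataset.

```python
import math


def heart_rate_errors(predicted, reference):
    """Return (MAE, RMSE, MAPE) for paired heart rate values in bpm.

    Hypothetical helper for illustration; uses the standard metric
    definitions, not code from the paper.
    """
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")

    diffs = [p - r for p, r in zip(predicted, reference)]
    n = len(diffs)

    # Mean absolute error: average magnitude of the error in bpm.
    mae = sum(abs(d) for d in diffs) / n
    # Root mean square error: penalizes large errors more heavily.
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    # Mean absolute percentage error: error relative to the reference,
    # in percent. Assumes no reference value is zero, which holds for
    # physiological heart rates.
    mape = 100.0 * sum(abs(d) / abs(r) for d, r in zip(diffs, reference)) / n
    return mae, rmse, mape


# Made-up example values (bpm), not experimental results.
predicted = [72.0, 80.0, 65.0, 90.0]
reference = [70.0, 80.0, 68.0, 100.0]
mae, rmse, mape = heart_rate_errors(predicted, reference)
print(f"MAE={mae:.2f} bpm, RMSE={rmse:.2f} bpm, MAPE={mape:.2f}%")
```

Lower values are better for all three metrics. MAE and RMSE are expressed in beats per minute, while MAPE is a percentage of the reference heart rate.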