Research on Facial Beauty Prediction Combining Multi-Task Transfer Learning with Knowledge Distillation
-
Abstract: Facial beauty prediction currently suffers from small data samples, unclear evaluation criteria, and large variations in facial appearance. Multi-task transfer learning can effectively exploit additional useful information from related tasks and source-domain tasks, while knowledge distillation can transfer part of a teacher model's knowledge to a student model, reducing model complexity and size. This paper combines multi-task transfer learning with knowledge distillation for facial beauty prediction, taking facial beauty prediction on the Large Scale Asian Facial Beauty Database (LSAFBD) as the main task and gender recognition on the SCUT-FBP5500 database as the auxiliary task. First, multi-input multi-task facial beauty teacher and student models are constructed; second, the multi-task teacher model is trained and its soft targets are computed; finally, knowledge distillation is performed by combining the soft targets of the multi-task teacher model with the soft and hard targets of the student model. Experimental results show that the multi-task teacher model achieves 68.23% accuracy on the facial beauty prediction task, but its structure is complex and it has 14,793K parameters, whereas the multi-task student model reaches 67.39% classification accuracy after knowledge distillation with a simple structure and only 1,366K parameters. The multi-task teacher model attains higher classification accuracy than other methods; although the student model's accuracy is slightly lower, it is simpler and has far fewer parameters, making it better suited to facial beauty prediction with a lightweight network.
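The distillation step summarized above, combining the teacher's temperature-softened soft targets with the student's hard cross-entropy targets, follows the standard knowledge distillation formulation. The Python sketch below is a minimal illustration under that assumption, not the paper's exact implementation; the temperature T, weighting factor alpha, and the function name distillation_loss are hypothetical choices for illustration only.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    # Soft-target term: KL divergence between the temperature-softened student and
    # teacher distributions, scaled by T^2 so its gradient magnitude is comparable
    # to the hard-target term (assumed values of T and alpha are illustrative).
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-target term: ordinary cross-entropy of the student against ground truth.
    hard = F.cross_entropy(student_logits, labels)
    # Weighted combination of the teacher's soft targets and the student's hard targets.
    return alpha * soft + (1.0 - alpha) * hard

if __name__ == "__main__":
    # Toy check with random logits for a hypothetical 3-class beauty rating task.
    student = torch.randn(8, 3)
    teacher = torch.randn(8, 3)
    labels = torch.randint(0, 3, (8,))
    print(distillation_loss(student, teacher, labels).item())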