GAN Junying, WU Bicheng, ZOU Qi, ZHENG Zexin, MAI Chaoyun, ZHAI Yikui, HE Guohui. Dual-input Dual-task Attention Network Incorporating Noisy Label Correction Mechanism for Facial Beauty Prediction[J]. JOURNAL OF SIGNAL PROCESSING, 2022, 38(10): 2124-2133. DOI: 10.16798/j.issn.1003-0530.2022.10.013

Dual-input Dual-task Attention Network Incorporating Noisy Label Correction Mechanism for Facial Beauty Prediction

Facial beauty prediction is an active research topic that studies whether computers can judge facial beauty as humans do. Current methods suffer from insufficient supervisory information, and their models are susceptible to noisy labels. The Multi-Task Attention Network (MTAN) uses a single database with multiple label types for supervised training, but it overlooks the fact that multiple databases, each carrying only one label type, do not train well in a multi-task setting. A noisy label correction mechanism corrects noisy labels by comparing the maximum predicted probability with the predicted probability of the given label. This paper therefore presents the Dual-Input Dual-Task Attention Network (DIDTAN), which builds on MTAN and incorporates a noisy label correction mechanism. DIDTAN uses the supervisory information of two single-label-type facial beauty databases simultaneously, which addresses the shortage of supervisory information, and its noisy label correction mechanism reduces the influence of noisy labels, improving the accuracy of facial beauty prediction. The Batch Normalization (BN) layer shared across tasks in MTAN is extended to task-specific BN layers in DIDTAN, and a Neural Discriminative Dimensionality Reduction (NDDR) module is introduced to constrain the expression of shallow features. In addition, the Deep CORrelation ALignment (Deep CORAL) loss function is used to constrain the expression of fully connected layer features, and noisy labels are corrected by the noisy label correction mechanism. Experiments on the Large Scale Facial Beauty Database (LSFBD), the SCUFBP-5500 database, and the CelebA database show that dual-input dual-task facial beauty prediction based on LSFBD and SCUFBP-5500 achieves 65.4% prediction accuracy, higher than the highest accuracy of conventional methods. The presented method achieves dual-input dual-task training and reduces the influence of noisy labels, which improves the accuracy of facial beauty prediction, and it can be applied to other dual-input dual-task scenarios where noisy labels exist.
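The abstract describes the noisy label correction mechanism only in outline: the maximum predicted probability is compared with the predicted probability of the given label. The following is a minimal sketch of one way such a comparison could work, not the authors' implementation. The function name `correct_noisy_labels`, the use of a probability difference, and the `threshold` value are all assumptions made for illustration; the abstract does not state the decision rule.

```python
def correct_noisy_labels(probs, labels, threshold=0.5):
    """Relabel samples whose given label looks inconsistent with the model.

    probs:     list of per-class probability lists, one per sample
    labels:    list of integer class labels (possibly noisy)
    threshold: hypothetical margin; not taken from the paper
    """
    corrected = []
    for p, y in zip(probs, labels):
        # Class with the maximum predicted probability.
        top = max(range(len(p)), key=lambda k: p[k])
        # Assumed rule: if the top class beats the given label's
        # probability by more than the margin, treat the label as noisy.
        if p[top] - p[y] > threshold:
            corrected.append(top)
        else:
            corrected.append(y)
    return corrected


probs = [
    [0.90, 0.05, 0.05],  # confident in class 0, labelled 2
    [0.40, 0.35, 0.25],  # uncertain, labelled 1
    [0.10, 0.80, 0.10],  # prediction agrees with label 1
]
labels = [2, 1, 1]
new_labels = correct_noisy_labels(probs, labels)
```

Under this assumed rule only the first sample is relabelled, because the model is both confident and in strong disagreement with the given label; the uncertain second sample keeps its original label.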