WANG Jinyang, HUA Guang, HUANG Shuang. End-to-end Synthetic Speech Detection Based on Attention Mechanism[J]. JOURNAL OF SIGNAL PROCESSING, 2022, 38(9): 1975-1987. DOI: 10.16798/j.issn.1003-0530.2022.09.020

End-to-end Synthetic Speech Detection Based on Attention Mechanism

In recent years, the rapid development of deepfake technology has significantly improved the naturalness and speaker individuality of synthetic speech, which makes synthetic speech detection more challenging. In this paper, five lightweight attention modules are adapted into channel attention and one-dimensional spatial attention mechanisms suited to speech sequences. Each module is then embedded into Inc-TSSDNet, yielding an end-to-end synthetic speech detection system based on the attention mechanism. The results show that the improved system can focus on the channels or regions that are more critical for detecting synthetic artifacts, which improves detection performance. Compared with the baseline model, all ten attention-based models reduce the equal error rate (EER) and the minimum tandem detection cost function (min t-DCF) on the evaluation set of the ASVspoof2019 challenge, at the cost of a slight increase in the number of model parameters. Among them, the model with CBAM (Convolutional Block Attention Module) embedded before the pooling layer achieves the lowest EER and shows promising generalization capability, while the model with the ECA (Efficient Channel Attention) module embedded before the pooling layer achieves the lowest min t-DCF and a statistically significant improvement over the baseline model.
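To make the idea of channel attention on a speech sequence concrete, the following is a minimal sketch of the generic ECA pattern applied to a feature map with C channels and T time steps: average each channel over time, mix neighbouring channel statistics with a small 1-D convolution, and rescale each channel by a sigmoid gate. It is an illustration only and not the modified module described in the paper. The function name `eca_channel_attention` and the fixed `kernel` weights are assumptions made for this example; in a trained model the kernel weights are learned.

```python
import math


def eca_channel_attention(features, kernel):
    """Illustrative ECA-style channel attention (hypothetical helper).

    features: list of C channels, each a list of T float samples.
    kernel:   odd-length list of weights for the 1-D convolution across
              channels (fixed here for illustration, learned in practice).
    Returns (scaled_features, channel_weights).
    """
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")
    pad = len(kernel) // 2

    # Squeeze: global average pooling over the time axis, one value per channel.
    pooled = [sum(channel) / len(channel) for channel in features]

    # Zero-pad so the convolution output has one value per channel.
    padded = [0.0] * pad + pooled + [0.0] * pad

    # Local cross-channel interaction followed by a sigmoid gate.
    weights = []
    for c in range(len(features)):
        z = sum(k * padded[c + j] for j, k in enumerate(kernel))
        weights.append(1.0 / (1.0 + math.exp(-z)))

    # Excite: rescale every time step of a channel by that channel's weight.
    scaled = [[w * x for x in channel] for w, channel in zip(weights, features)]
    return scaled, weights


# Three channels, two time steps; an identity kernel keeps the gate easy to read.
demo_features = [[1.0, 1.0], [0.0, 0.0], [-1.0, -1.0]]
demo_scaled, demo_weights = eca_channel_attention(demo_features, [0.0, 1.0, 0.0])
```

Because the gate depends only on per-channel statistics and a short kernel, a module of this kind adds very few parameters, which is consistent with the slight parameter increase reported above.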