REN Yanzhen, LIU Chenyu, LIU Wuyang, WANG Lina. A survey on speech forgery and detection[J]. JOURNAL OF SIGNAL PROCESSING, 2021, 37(12): 2412-2439. DOI: 10.16798/j.issn.1003-0530.2021.12.011

A survey on speech forgery and detection

  • Speech carries both linguistic content and speaker identity information. Speech forgery technology can accurately imitate the voice of a target speaker in order to deceive human listeners or machine listening systems. Deepfake technology currently poses a serious threat to global politics, the economy, and social stability, and speech forgery is one of its core technologies for manipulating public opinion. In recent years, forged speech has improved markedly in human-likeness and naturalness, which makes forged speech detection considerably more challenging. This article reviews the state of research on mainstream speech forgery and forged speech detection technologies, covering: 1) the basic concepts, technological development, and research progress of mainstream speech forgery technologies, including speech synthesis, voice conversion, and speech adversarial examples; 2) the basic concepts, performance evaluation metrics, main technical principles, and performance of forged speech detection technologies; 3) mainstream competitions, commonly used datasets, and available source code and tool resources related to forged speech detection. Finally, the article discusses open challenges and future research directions for speech forgery and detection technology.
