A speech conversion method based on the separation of speaker-specific characteristics

MA Zhen; ZHANG Xiong-Wei; YANG Ji-Bin

MA Zhen, ZHANG Xiong-Wei, YANG Ji-Bin. A speech conversion method based on the separation of speaker-specific characteristics[J]. JOURNAL OF SIGNAL PROCESSING, 2013, 29(4): 513-519.

Abstract

This paper studies how to characterize speaker-specific voice characteristics independently and completely. Treating the problem as one of information separation, we propose a method that separates voice characteristics from linguistic content in speech and uses this separation to perform voice conversion. The method exploits the ability of the K-SVD algorithm to train a dictionary that captures both the personal characteristics of a speaker and the inter-frame correlation of speech. A dictionary containing the personal characteristics is first extracted from the training data with K-SVD; the trained dictionary is then combined with the remaining content information to reconstruct the target speech. Compared with traditional methods, the proposed method uses the sparse nature of speech to better preserve personal characteristics, and it readily addresses the problems encountered in feature-mapping methods, so improved voice conversion is expected. Subjective evaluations show that the proposed method outperforms methods based on the Gaussian Mixture Model and the Artificial Neural Network in both speech quality and similarity of the converted speech to the target.
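
The dictionary-training and reconstruction steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the frame features, the dictionary size, the sparsity level `k`, and the helper names `omp`, `ksvd`, `sparse_code` and `convert` are all assumptions chosen for illustration, and the synthetic data stands in for real speech features. In particular, `convert` assumes that the atoms of the source and target dictionaries correspond one to one, which the abstract does not specify.

```python
import numpy as np

def omp(D, x, k):
    """Orthogonal matching pursuit: approximate x with at most k atoms of D."""
    residual = x.copy()
    support = []
    sol = np.zeros(0)
    for _ in range(k):
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx in support:  # no new atom improves the fit
            break
        support.append(idx)
        # Least-squares refit on all atoms selected so far.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    coef = np.zeros(D.shape[1])
    coef[support] = sol
    return coef

def sparse_code(D, X, k):
    """Sparse-code every column (frame) of X over dictionary D."""
    return np.column_stack([omp(D, x, k) for x in X.T])

def ksvd(X, n_atoms, k, n_iter=10, seed=0):
    """Basic K-SVD: alternate sparse coding and rank-1 atom updates."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((X.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)
    for _ in range(n_iter):
        A = sparse_code(D, X, k)
        for j in range(n_atoms):
            users = np.nonzero(A[j])[0]  # frames that use atom j
            if users.size == 0:
                continue
            A[j, users] = 0.0
            # Error of those frames without atom j's contribution.
            E = X[:, users] - D @ A[:, users]
            U, s, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, j] = U[:, 0]            # updated unit-norm atom
            A[j, users] = s[0] * Vt[0]   # updated coefficients
    return D

def convert(D_source, D_target, X_source, k):
    """Hypothetical conversion: keep the sparse codes (content),
    swap in the target dictionary (speaker characteristics).
    Assumes the atoms of both dictionaries are aligned."""
    return D_target @ sparse_code(D_source, X_source, k)

# Synthetic stand-in for speech feature frames (16-dim, 200 frames).
rng = np.random.default_rng(1)
true_D = rng.standard_normal((16, 24))
true_D /= np.linalg.norm(true_D, axis=0)
codes = np.zeros((24, 200))
for n in range(200):
    codes[rng.choice(24, size=3, replace=False), n] = rng.standard_normal(3)
X = true_D @ codes

D = ksvd(X, n_atoms=24, k=3)
A = sparse_code(D, X, k=3)
rel_err = np.linalg.norm(X - D @ A) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_err:.3f}")
```

In this sketch the dictionary plays the role of the speaker-specific component and the sparse codes play the role of the content information; how the paper actually pairs source and target dictionaries is not stated in the abstract.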
