Advances in speaker segmentation and clustering

MA Yong, BAO Chang-chun. Advances in speaker segmentation and clustering[J]. JOURNAL OF SIGNAL PROCESSING, 2013, 29(9): 1190-1199.

Abstract

Speaker segmentation and clustering determine the start and end times of speaker segments in a multi-speaker audio stream and label each speech segment with the identity of its speaker. The task has become an active research topic in speech signal processing in recent years, and it plays an important role in automatic speech recognition (ASR), multi-speaker recognition, and content-based audio analysis. This paper reviews in detail the state-of-the-art algorithms, techniques, and typical systems proposed over the past decade, grouping them by implementation process into asynchronous and synchronous strategies. The performance of typical systems is compared using results from recent NIST Rich Transcription (RT) evaluations. Finally, the open problems in this research area are discussed and its future prospects are described.
