Abstract:
Conventional speech enhancement methods, which are typically based on the short-time Fourier transform, assume that speech signals are short-time stationary. This assumption leads to imprecise spectral estimation of the noise, and consequently to residual noise and speech distortion in the enhanced speech. In this paper, we propose a new speech enhancement method that does not rely on the quasi-stationary assumption. Noisy speech is first transformed to the joint time-frequency domain by the fast real-valued discrete Gabor transform (RDGT), with a Gaussian window as the transform kernel because of its superior local energy concentration. An MMSE-based log-amplitude estimator of speech is derived under the speech presence uncertainty hypothesis and the assumption that the speech and noise coefficients are statistically independent Gaussian random variables. The noise spectrum is estimated by the improved minima controlled recursive averaging (IMCRA) algorithm. The clean speech estimate is obtained by the inverse RDGT. Experimental results show that the proposed method is effective in avoiding musical residual noise and retaining weak speech components.
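
As a rough illustration of the noise-estimation step, the sketch below shows the recursive-averaging update at the core of minima controlled recursive averaging, applied to a single time-frequency coefficient. It is a simplification rather than the full IMCRA algorithm: the speech presence probability is taken as a given input (IMCRA derives it from minimum tracking and also applies a bias compensation, both omitted here), and the function name, parameter names, and default smoothing factor are assumptions chosen for illustration, not values from this paper.

```python
def update_noise_estimate(noise_var, noisy_power, speech_prob, alpha=0.85):
    """One recursive-averaging step for a single time-frequency bin.

    noise_var   -- current noise variance estimate for this bin
    noisy_power -- squared magnitude of the noisy coefficient
    speech_prob -- speech presence probability in [0, 1] (assumed given)
    alpha       -- base smoothing factor in (0, 1) (illustrative default)
    """
    if not 0.0 <= speech_prob <= 1.0:
        raise ValueError("speech_prob must lie in [0, 1]")
    # The effective smoothing factor moves toward 1 as speech becomes
    # more likely, which freezes the noise estimate during speech.
    alpha_eff = alpha + (1.0 - alpha) * speech_prob
    return alpha_eff * noise_var + (1.0 - alpha_eff) * noisy_power
```

When speech is certainly present (`speech_prob = 1`), the estimate is left unchanged; when speech is certainly absent, the update reduces to ordinary first-order smoothing of the noisy power.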