Abstract:
A sparse convolutive non-negative matrix factorization method based on an asymmetric cost function is proposed. The method uses the Itakura-Saito distance as the objective cost function to measure the error between a target matrix and its reconstruction, so that smaller matrix elements incur smaller reconstruction errors; the cost function is also scale invariant. To evaluate its advantage in reconstructing weak spectral components, a whispered speech basis and its coefficients are derived with the proposed algorithm and then used to reconstruct the whispered speech. Experimental results show that the proposed algorithm reconstructs weak speech components better than algorithms based on the Euclidean distance and the Kullback-Leibler (K-L) divergence. Speech reconstructed by the proposed method also shows a larger improvement in intelligibility.
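The scale-invariance property mentioned above can be illustrated with a minimal sketch. It is not the proposed algorithm: it only compares the commonly used element-wise forms of the three cost functions on a small vector and on an attenuated copy of it. The function names, the sample values, and the attenuation factor `weak` are assumptions made for illustration.

```python
import math


def is_divergence(v, v_hat):
    """Itakura-Saito divergence, summed over elements: x/y - log(x/y) - 1."""
    return sum(x / y - math.log(x / y) - 1.0 for x, y in zip(v, v_hat))


def kl_divergence(v, v_hat):
    """Generalized K-L divergence, summed over elements: x*log(x/y) - x + y."""
    return sum(x * math.log(x / y) - x + y for x, y in zip(v, v_hat))


def euclidean_cost(v, v_hat):
    """Squared Euclidean distance, summed over elements: (x - y)^2."""
    return sum((x - y) ** 2 for x, y in zip(v, v_hat))


def scaled(values, factor):
    """Return a copy of `values` multiplied by `factor`."""
    return [factor * x for x in values]


# Illustrative (hypothetical) spectral magnitudes and their reconstruction.
target = [4.0, 1.0, 0.25]
estimate = [5.0, 0.8, 0.3]

# Hypothetical attenuation that turns the same pattern into a weak component.
weak = 1e-3

for name, cost in [
    ("Itakura-Saito", is_divergence),
    ("K-L", kl_divergence),
    ("Euclidean", euclidean_cost),
]:
    strong_cost = cost(target, estimate)
    weak_cost = cost(scaled(target, weak), scaled(estimate, weak))
    print(f"{name}: strong={strong_cost:.6g}, weak={weak_cost:.6g}")
```

Under these definitions the Itakura-Saito cost depends only on the ratio between each element and its reconstruction, so it is unchanged when both are scaled by the same factor. The K-L cost shrinks linearly with the factor and the Euclidean cost shrinks quadratically, which means that the same relative error on a weak component contributes far less to those two objectives.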