Single Channel Speech Enhancement Based on Dual-path Recurrent Neural Network

WANG Zhijie; ZHANG Xueliang

doi:10.16798/j.issn.1003-0530.2021.10.010

WANG Zhijie, ZHANG Xueliang. Single Channel Speech Enhancement Based on Dual-path Recurrent Neural Network[J]. JOURNAL OF SIGNAL PROCESSING, 2021, 37(10): 1872-1879. DOI: 10.16798/j.issn.1003-0530.2021.10.010

Abstract

In recent years, speech enhancement has improved significantly with the application of neural networks. However, for long, strongly correlated speech sequences, a single network structure may be unable to improve enhancement further because of its inherent performance limitations. To further improve neural-network-based speech enhancement, this paper applies a composite network structure, the dual-path recurrent neural network (DPRNN), to the speech enhancement task. The composite structure combines a convolutional neural network (CNN) with long short-term memory (LSTM) networks; its core is the dual-path recurrent neural network block (DPRNN Block), which is composed of two LSTMs. DPRNN splits a long sequence of speech frames into overlapping chunks and uses DPRNN Blocks to perform intra-chunk and inter-chunk computations on them, thereby modeling the data both locally and globally. Experimental results show that, compared with single network structures, DPRNN achieves the best results under both trained and untrained noise conditions.
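The chunking step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `segment` and `overlap_add`, the chunk length, and the hop size are all assumptions chosen for illustration, and the LSTMs themselves are omitted. The sketch only shows how a frame sequence could be split into overlapping chunks, which axis an intra-chunk model would run along, which axis an inter-chunk model would run along, and how the chunks could be merged back.

```python
import numpy as np

def segment(x, chunk_len, hop):
    """Split a (T, F) frame sequence into overlapping chunks of shape (S, K, F).

    chunk_len (K) and hop are illustrative assumptions; hop <= chunk_len.
    The sequence is zero-padded at the end so the last chunk is full.
    """
    T, F = x.shape
    n_chunks = max(1, int(np.ceil((T - chunk_len) / hop)) + 1)
    padded_len = (n_chunks - 1) * hop + chunk_len
    padded = np.zeros((padded_len, F), dtype=x.dtype)
    padded[:T] = x
    return np.stack([padded[i * hop:i * hop + chunk_len] for i in range(n_chunks)])

def overlap_add(chunks, hop, length):
    """Merge (S, K, F) chunks back into a (length, F) sequence.

    Overlapping regions are averaged, so segment followed by overlap_add
    reproduces the original sequence when the chunks are unmodified.
    """
    S, K, F = chunks.shape
    total = (S - 1) * hop + K
    out = np.zeros((total, F))
    weight = np.zeros((total, 1))
    for i in range(S):
        out[i * hop:i * hop + K] += chunks[i]
        weight[i * hop:i * hop + K] += 1
    return (out / weight)[:length]

# Toy input: 11 frames with 2 features each (hypothetical sizes).
x = np.arange(22, dtype=float).reshape(11, 2)
chunks = segment(x, chunk_len=4, hop=2)        # shape (S, K, F)

# Intra-chunk path: S short sequences of length K (local modeling).
intra_view = chunks                            # a recurrent model would scan axis 1
# Inter-chunk path: K sequences of length S (global modeling).
inter_view = chunks.transpose(1, 0, 2)         # a recurrent model would scan axis 1 here too

restored = overlap_add(chunks, hop=2, length=x.shape[0])
```

In a DPRNN Block, one recurrent network would process `intra_view` and a second would process `inter_view`, so each pass handles sequences far shorter than the original input.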
