Jointly Denoising and Dereverberation with Maximum Likelihood Beamformer Under Complex Generalized Gaussian Distribution

MENG Weixin, LI Jian, ZHENG Chengshi, LI Xiaodong

Citation: MENG Weixin, LI Jian, ZHENG Chengshi, LI Xiaodong. Jointly Denoising and Dereverberation with Maximum Likelihood Beamformer Under Complex Generalized Gaussian Distribution[J]. Journal of Signal Processing, 2022, 38(4): 677-689. DOI: 10.16798/j.issn.1003-0530.2022.04.002

Funding: National Natural Science Foundation of China, Young Scientists Fund (No. 62001467)

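Details
    About the authors:

    MENG Weixin, male, born in 1997, from Dongying, Shandong. Ph.D. student at the University of Chinese Academy of Sciences and the Institute of Acoustics, Chinese Academy of Sciences. His research interests are speech signal processing and array signal processing. E-mail: mengweixin@mail.ioa.ac.cn

    LI Jian, male, born in 1989, from Yangzhou, Jiangsu. Assistant research fellow at the Institute of Acoustics, Chinese Academy of Sciences. His research interests are speech signal processing and array signal processing. E-mail: lijian@mail.ioa.ac.cn

    ZHENG Chengshi, male, born in 1980, from Sanming, Fujian. Research fellow and doctoral supervisor at the Institute of Acoustics, Chinese Academy of Sciences. His research interests are speech signal processing, array signal processing, and machine learning. E-mail: cszheng@mail.ioa.ac.cn

    LI Xiaodong, male, born in 1966, from Yangzhou, Jiangsu. Research fellow and doctoral supervisor at the Institute of Acoustics, Chinese Academy of Sciences. His research interests include audio and speech signal processing, active noise and vibration control, monitoring and analysis of sound and vibration signals, and acoustic measurement and metrology. E-mail: lxd@mail.ioa.ac.cn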

  • CLC number: TN911.7

  • Abstract: This paper proposes a multichannel joint denoising and dereverberation beamformer based on a complex super-Gaussian distribution. Speech is modelled with a complex super-Gaussian distribution, and a closed-form expression for the joint denoising and dereverberation beamformer is derived for the first time under the maximum likelihood criterion. This expression is shown to be a generalized form of several existing joint denoising and dereverberation beamformers. A theoretical analysis further shows that the proposed algorithm outperforms the conventional cascade of the weighted prediction error (WPE) algorithm and the minimum power distortionless response (MPDR) beamformer. Simulations and real-world experiments both show that the proposed algorithm clearly outperforms existing joint denoising and dereverberation algorithms on several objective measures.
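    The abstract does not reproduce the closed-form solution, so the following is only a minimal sketch of the general idea behind a reweighted, distortionless convolutional beamformer of the WPD family. It is not the authors' derivation. The function names, the shape parameter `p`, the weight rule `|y|^(2-p)` (the standard iteratively reweighted least squares rule for an l_p cost), the diagonal loading, and the default settings are all assumptions made for illustration.

```python
import numpy as np


def stack_frames(X, delay, taps):
    """Stack each frame with `taps` past frames, the newest one `delay` frames back.

    X has shape (T, M): T STFT frames of M microphones in one frequency bin.
    Returns an array of shape (T, M * (1 + taps)).
    """
    T = X.shape[0]
    blocks = [X]
    for k in range(delay, delay + taps):
        past = np.zeros_like(X)
        past[k:] = X[:T - k]          # frame t sees frame t - k; earlier frames are zero
        blocks.append(past)
    return np.concatenate(blocks, axis=1)


def reweighted_convolutional_beamformer(X, steering, delay=2, taps=3, p=0.5,
                                        iters=3, eps=1e-6):
    """Alternate between per-frame weights and a distortionless filter (hypothetical)."""
    T, M = X.shape
    Xs = stack_frames(X, delay, taps)
    D = Xs.shape[1]
    v = np.zeros(D, dtype=complex)
    v[:M] = steering                   # steering vector, zero-padded over past frames
    y = X[:, 0].copy()                 # start from the reference microphone
    for _ in range(iters):
        # Assumed IRLS weight for an l_p cost: p = 2 gives constant weights,
        # p -> 0 gives the time-varying power |y|^2.
        lam = np.maximum(np.abs(y) ** (2.0 - p), eps)
        # Weighted covariance R = sum_t x_t x_t^H / lam_t, averaged over frames.
        R = (Xs.T / lam) @ Xs.conj() / T
        R += eps * np.trace(R).real / D * np.eye(D)   # diagonal loading
        Rinv_v = np.linalg.solve(R, v)
        w = Rinv_v / np.vdot(v, Rinv_v)               # enforces w^H v = 1
        y = Xs @ w.conj()                             # y_t = w^H x_t
    return w, y
```

    With `p = 2` the weights are constant and the update reduces to a single minimum-power distortionless filter on the stacked observation, which is one way to see how a shape parameter can interpolate between differently weighted beamformers.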
  • Figure 1. Processing flow of the cascaded denoising and dereverberation algorithm

    Figure 2. Results for different numbers of iterations. (a) PESQ improvement; (b) ESTOI improvement; (c) SDR improvement; (d) SRMR improvement

    Figure 3. Results for different input SINR values. (a) PESQ improvement; (b) ESTOI improvement; (c) SDR improvement; (d) SRMR improvement

    Figure 4. Results for different reverberation times. (a) PESQ improvement; (b) ESTOI improvement; (c) SDR improvement; (d) SRMR improvement

    Figure 5. Spectrograms of a REVERB Challenge test sample. (a) Clean speech; (b) noise; (c) noisy speech; (d) WPE+MPDR; (e) WPD; (f) CGG-WPD

    Table 1. Computational complexity of the three algorithms

    | Method | Computational complexity | M=6, Lw=10, b=4, I=5 |
    | --- | --- | --- |
    | WPE+MPDR | O(M^3 (Lw-b+1)^3 I) + O(M^3) | O(370656) |
    | WPD | O((M(Lw-b+1)+M)^3 I) | O(552960) |
    | CGG-WPD | O((M(Lw-b+1)+M)^3 I) | O(552960) |
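
    The numeric column of Table 1 can be checked by evaluating the two complexity expressions directly. The short script below is a sketch for that purpose only. The function names are hypothetical, and since this page does not define the symbols, the code treats M, Lw, b, and I simply as the integers given in the table header.

```python
def wpe_mpdr_ops(M, Lw, b, I):
    """Operation count for the WPE+MPDR cascade, as written in Table 1."""
    return M**3 * (Lw - b + 1)**3 * I + M**3


def wpd_ops(M, Lw, b, I):
    """Operation count shared by WPD and CGG-WPD, as written in Table 1."""
    return (M * (Lw - b + 1) + M)**3 * I


cascade = wpe_mpdr_ops(M=6, Lw=10, b=4, I=5)
joint = wpd_ops(M=6, Lw=10, b=4, I=5)
print(cascade, joint, round(joint / cascade, 2))  # 370656 552960 1.49
```

    Under these settings the joint methods cost roughly 1.5 times as much as the cascade, and CGG-WPD has the same order of complexity as WPD.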

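    Table 2. Experimental results for the small room in the REVERB Challenge

    | Method | PESQ (SINR=0 dB) | ESTOI (SINR=0 dB) | SDR/dB (SINR=0 dB) | SRMR/dB (SINR=0 dB) | PESQ (SINR=10 dB) | ESTOI (SINR=10 dB) | SDR/dB (SINR=10 dB) | SRMR/dB (SINR=10 dB) |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | NOISY | 1.35 | 0.38 | -0.92 | 2.30 | 1.98 | 0.60 | 6.49 | 5.29 |
    | WPE+MPDR | 1.37 | 0.39 | -1.27 | 2.50 | 2.17 | 0.65 | 8.00 | 6.20 |
    | WPD | 2.09 | 0.56 | 4.68 | 5.32 | 2.91 | 0.84 | 12.12 | 9.11 |
    | CGG-WPD | 2.26 | 0.62 | 6.17 | 6.61 | 3.04 | 0.85 | 12.30 | 9.35 |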

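    Table 3. Experimental results for the medium room in the REVERB Challenge

    | Method | PESQ (SINR=0 dB) | ESTOI (SINR=0 dB) | SDR/dB (SINR=0 dB) | SRMR/dB (SINR=0 dB) | PESQ (SINR=10 dB) | ESTOI (SINR=10 dB) | SDR/dB (SINR=10 dB) | SRMR/dB (SINR=10 dB) |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | NOISY | 1.10 | 0.31 | -2.81 | 1.84 | 1.63 | 0.48 | 2.32 | 3.69 |
    | WPE+MPDR | 1.30 | 0.40 | -0.62 | 2.65 | 2.09 | 0.66 | 7.59 | 6.37 |
    | WPD | 1.91 | 0.64 | 5.72 | 6.31 | 2.70 | 0.84 | 10.72 | 9.22 |
    | CGG-WPD | 2.01 | 0.66 | 6.09 | 6.85 | 2.80 | 0.85 | 10.96 | 9.47 |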

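    Table 4. Experimental results for the large room in the REVERB Challenge

    | Method | PESQ (SINR=0 dB) | ESTOI (SINR=0 dB) | SDR/dB (SINR=0 dB) | SRMR/dB (SINR=0 dB) | PESQ (SINR=10 dB) | ESTOI (SINR=10 dB) | SDR/dB (SINR=10 dB) | SRMR/dB (SINR=10 dB) |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | NOISY | 1.12 | 0.28 | -3.46 | 1.67 | 1.57 | 0.42 | 1.23 | 3.13 |
    | WPE+MPDR | 1.39 | 0.40 | -0.38 | 2.82 | 2.10 | 0.63 | 7.15 | 6.43 |
    | WPD | 1.75 | 0.51 | -1.27 | 5.60 | 2.55 | 0.77 | 9.87 | 8.77 |
    | CGG-WPD | 2.02 | 0.62 | 5.92 | 6.64 | 2.61 | 0.78 | 9.97 | 8.98 |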
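
    To make the comparison in Tables 2 to 4 concrete, the snippet below computes the SDR gain of CGG-WPD over WPD at SINR = 0 dB for the three rooms. It is a small illustration only: the values are copied from the tables, and the dictionary layout and variable names are hypothetical.

```python
# SDR in dB at SINR = 0 dB, copied from Tables 2, 3, and 4.
sdr_0db = {
    "small": {"WPD": 4.68, "CGG-WPD": 6.17},
    "medium": {"WPD": 5.72, "CGG-WPD": 6.09},
    "large": {"WPD": -1.27, "CGG-WPD": 5.92},
}

# Gain of CGG-WPD over WPD per room, rounded to the tables' precision.
gains = {room: round(v["CGG-WPD"] - v["WPD"], 2) for room, v in sdr_0db.items()}
print(gains)  # {'small': 1.49, 'medium': 0.37, 'large': 7.19}
```

    The gap is largest in the large room at the low SINR, where the reported WPD output SDR is negative while CGG-WPD stays close to its values in the other two rooms.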

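    Table 5. Experimental results on CHiME-3

    | Method | PESQ (BUS) | ESTOI (BUS) | PESQ (CAF) | ESTOI (CAF) | PESQ (PED) | ESTOI (PED) | PESQ (STR) | ESTOI (STR) |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | NOISY | 2.40 | 0.73 | 1.90 | 0.56 | 1.81 | 0.54 | 2.03 | 0.67 |
    | WPE+MPDR | 2.77 | 0.80 | 2.35 | 0.70 | 2.20 | 0.65 | 2.51 | 0.76 |
    | WPD | 2.92 | 0.87 | 2.48 | 0.78 | 2.33 | 0.74 | 2.73 | 0.85 |
    | CGG-WPD | 3.02 | 0.88 | 2.54 | 0.80 | 2.39 | 0.76 | 2.83 | 0.86 |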
Publication history
  • Received:  2021-06-24
  • Published:  2022-04-24
