CHEN Liang, SHAO Yubin, LONG Hua, DU Qingzhi, PENG Yi, TANG Weikang. Language Identification for Broadcasting Signal Based on Time-domain Gammatone Filtering Features[J]. JOURNAL OF SIGNAL PROCESSING, 2022, 38(3): 599-608. DOI: 10.16798/j.issn.1003-0530.2022.03.018

Language Identification for Broadcasting Signal Based on Time-domain Gammatone Filtering Features

  • A time-domain speech filtering method is proposed for broadcast language identification. The pre-processed speech signal is filtered by convolution with the gammatone time-domain function; windowing and the logarithm of the signal energy are then applied to obtain the time-domain gammatone filterbank features of each frame. The feature parameters are then rendered as images, and language identification experiments are carried out on them with the VGG19 and ResNet34 classification networks. An automatic color scale algorithm is also used to denoise the feature images of noise-added speech. The effects of feature dimension, noise type and signal-to-noise ratio on language identification accuracy are compared. The results show that the proposed feature parameters yield higher language identification accuracy than the traditional GFCC, GFCC-D-A, GFCC-SDC and Fbank features, and that accuracy also improves across the different noise types and signal-to-noise ratios tested in broadcast speech identification scenarios.
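
The feature extraction steps described above (gammatone filtering by time-domain convolution, windowing, and log energy per frame) can be illustrated with a minimal sketch. It is not the authors' implementation. The fourth-order gammatone impulse response, the ERB bandwidth formula, the ERB-rate spacing of center frequencies, the 32 channels, the 25 ms Hamming window with a 10 ms hop, and all function names are assumptions chosen for illustration; the abstract does not state these values. Pre-processing, image rendering and the automatic color scale step are omitted.

```python
import numpy as np

def erb(fc):
    # Equivalent rectangular bandwidth in Hz (Glasberg and Moore formula; assumed here).
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def center_freqs(n_filters, f_min, f_max):
    # Center frequencies spaced uniformly on the ERB-rate scale (an assumption).
    lo = 21.4 * np.log10(4.37 * f_min / 1000.0 + 1.0)
    hi = 21.4 * np.log10(4.37 * f_max / 1000.0 + 1.0)
    e = np.linspace(lo, hi, n_filters)
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37

def gammatone_ir(fc, sr, duration=0.025, order=4):
    # Time-domain gammatone function: t^(n-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t).
    t = np.arange(int(duration * sr)) / sr
    b = 1.019 * erb(fc)
    g = t ** (order - 1) * np.exp(-2.0 * np.pi * b * t) * np.cos(2.0 * np.pi * fc * t)
    return g / np.max(np.abs(g))  # peak-normalize the impulse response

def td_gammatone_features(x, sr, n_filters=32, frame_len=0.025, hop=0.010):
    # Returns an (n_filters, n_frames) matrix of log frame energies.
    n = int(frame_len * sr)
    h = int(hop * sr)
    window = np.hamming(n)
    n_frames = 1 + (len(x) - n) // h
    feats = np.empty((n_filters, n_frames))
    for i, fc in enumerate(center_freqs(n_filters, 50.0, 0.45 * sr)):
        # Filter the whole signal by convolution with the gammatone impulse response.
        y = np.convolve(x, gammatone_ir(fc, sr), mode="same")
        for j in range(n_frames):
            frame = y[j * h : j * h + n] * window
            # Small constant avoids log(0) on silent frames.
            feats[i, j] = np.log(np.sum(frame ** 2) + 1e-10)
    return feats
```

The resulting matrix has one row per gammatone channel and one column per frame, so it can be treated as a single-channel image before being passed to an image classifier.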