Timbre Feature Extraction for Musical Instruments Based on TVF-EMD and Its Application

Abstract: Timbre is an important cue for music perception and speech recognition. Traditional timbre analysis methods cannot achieve ideal time resolution and frequency resolution simultaneously, and they handle the non-stationary character of audio poorly. To address these problems, this paper uses time varying filtering based empirical mode decomposition (TVF-EMD) to extract the intrinsic mode functions (IMFs) of audio, applies the Hilbert transform to them, and constructs two kinds of timbre features: Hilbert spectrum distribution features and Hilbert contour features. For musical instrument classification, the two kinds of features are combined with Mel frequency cepstral coefficients (MFCCs), and a temporal timbre classifier based on a bidirectional long short-term memory (BiLSTM) network is constructed. Classification experiments were carried out on a public database of musical instrument performance recordings. The results show that the proposed features supplement the nonlinear, non-stationary information that traditional features such as MFCCs do not capture, and that they improve the adaptability and robustness of the timbre representation for complex audio.
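The Hilbert-transform step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the intrinsic mode functions are already available: the TVF-EMD decomposition itself is not implemented here, and two synthetic sinusoids stand in for real IMFs. The function names, the 32-bin layout, and the energy-weighted histogram are hypothetical choices made for illustration only; the abstract does not define how the Hilbert spectrum distribution feature is computed in the paper.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT: zero the negative frequencies, double the positive ones."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep the DC term once
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist term once
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def instantaneous_amplitude_frequency(mode, fs):
    """Instantaneous amplitude and frequency (Hz) of one mode."""
    z = analytic_signal(mode)
    amplitude = np.abs(z)
    phase = np.unwrap(np.angle(z))
    frequency = np.diff(phase) * fs / (2.0 * np.pi)  # one sample shorter than the mode
    return amplitude, frequency

def hilbert_spectrum_distribution(modes, fs, n_bins=32):
    """Energy-weighted histogram of instantaneous frequency, summed over all modes.

    Illustrative feature only; not the definition used in the paper.
    """
    edges = np.linspace(0.0, fs / 2.0, n_bins + 1)
    hist = np.zeros(n_bins)
    for mode in modes:
        amp, freq = instantaneous_amplitude_frequency(mode, fs)
        h, _ = np.histogram(freq, bins=edges, weights=amp[:-1] ** 2)
        hist += h
    total = hist.sum()
    return hist / total if total > 0 else hist

# Synthetic stand-ins for IMFs (assumption): a 440 Hz tone and a weaker 1320 Hz partial.
fs = 8000
t = np.arange(fs) / fs
modes = [np.sin(2 * np.pi * 440 * t), 0.5 * np.sin(2 * np.pi * 1320 * t)]
features = hilbert_spectrum_distribution(modes, fs)
```

In this sketch `features` is a fixed-length vector per audio segment; computing it frame by frame would give a sequence that could be concatenated with MFCCs and passed to a sequence classifier such as a BiLSTM, which is the kind of combination the abstract describes.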

     
