Feature Fusion Based on DBN for Cross-corpus speech emotion recognition

ZHANG Xin-ran; JU Xiao-zheng; SONG Peng; ZHA Cheng; ZHAO Li

doi:10.16798/j.issn.1003-0530.2017.05.001

ZHANG Xin-ran, JU Xiao-zheng, SONG Peng, ZHA Cheng, ZHAO Li. Feature Fusion Based on DBN for Cross-corpus speech emotion recognition[J]. JOURNAL OF SIGNAL PROCESSING, 2017, 33(5): 649-660. DOI: 10.16798/j.issn.1003-0530.2017.05.001

Abstract

In cross-corpus speech emotion recognition (SER), multi-scale feature fusion remains a technical difficulty. Drawing on Deep Belief Nets (DBNs) from the field of deep learning, this paper proposes a feature-level fusion method for cross-corpus SER. Building on earlier research on feature abstraction, the emotional traits hidden in the speech spectrogram are extracted as image features and fused with traditional emotion features. First, based on spectrogram analysis with the STB/Itti model, new spectrogram features are extracted from color, brightness, and orientation, respectively. Then modified DBNs fuse the traditional and spectrogram features, which enlarges the feature subset and improves its ability to characterize emotion. In experiments on the ABC database and a Chinese corpus, the new feature subset is compared with traditional speech emotion features, and the cross-corpus recognition results show a clear improvement.
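
As a rough illustration of what feature-level fusion means here, the sketch below standardizes two feature blocks and concatenates them per utterance, giving the kind of joint vector that could be fed to a DBN. It is not the paper's modified DBN: the function names, array shapes, and random data are hypothetical, and z-score scaling is an assumed preprocessing choice that the abstract does not specify.

```python
import numpy as np

def zscore(block):
    """Standardize each feature column to zero mean and unit variance."""
    mean = block.mean(axis=0)
    std = block.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero on constant columns
    return (block - mean) / std

def fuse_features(traditional, spectrogram):
    """Feature-level fusion: scale each block, then concatenate per utterance."""
    if traditional.shape[0] != spectrogram.shape[0]:
        raise ValueError("both blocks must describe the same utterances")
    return np.hstack([zscore(traditional), zscore(spectrogram)])

# Hypothetical data: 8 utterances, 5 traditional acoustic features,
# and 3 spectrogram features (color, brightness, orientation).
rng = np.random.default_rng(0)
traditional = rng.normal(size=(8, 5))
spectrogram = rng.normal(size=(8, 3))

fused = fuse_features(traditional, spectrogram)
print(fused.shape)  # (8, 8): one fused 8-dimensional vector per utterance
```

Scaling each block before concatenation keeps one feature family from dominating the other purely because of its numeric range, which matters when the two families come from different analyses of the signal.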
