Abstract:
This paper studies sound recognition in complex environments and proposes an environmental sound recognition method that combines Mel-frequency spectral coefficients (MFSC) with a convolutional neural network (CNN). MFSC features are extracted from sound events and used as input to the designed CNN model, which classifies the events. Experiments on the ESC-10 data set compare the constructed CNN model with random forest, support vector machine (SVM), and deep neural network (DNN) classifiers, and with three recognition models commonly used in DCASE competitions. The results show that, on the same data set, the proposed method improves the recognition rate by 13.1%, 18.3%, and 15.7%, respectively, over these three traditional sound recognition methods. Compared with the three models commonly used in DCASE competitions, the proposed model also shows clear advantages in both recognition rate and recognition efficiency.
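To make the feature extraction step concrete, the following is a minimal sketch of how MFSC features are commonly computed: log energies of a triangular Mel filterbank applied to a frame's power spectrum, without the final DCT used for MFCC. It is an illustration under stated assumptions, not the implementation used in this paper. The function names, the HTK-style Mel formula, the Hamming window, and the naive DFT are all assumptions chosen to keep the example self-contained.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert Hz to mel (HTK-style formula; an assumed choice)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Build triangular filters spaced evenly on the mel scale.

    Returns n_filters rows, each holding n_fft // 2 + 1 weights.
    """
    n_bins = n_fft // 2 + 1
    max_mel = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge points: left edge, filter centers, right edge.
    edges_hz = [mel_to_hz(max_mel * i / (n_filters + 1))
                for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, center, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (center - lo)
            falling = (hi - f) / (hi - center)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def power_spectrum(frame, n_fft):
    """Power spectrum of one Hamming-windowed frame.

    Uses a naive O(n^2) DFT for clarity; real code would use an FFT.
    """
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    windowed += [0.0] * (n_fft - n)  # zero-pad up to n_fft
    spectrum = []
    for k in range(n_fft // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n_fft)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n_fft)
    return spectrum


def mfsc_frame(frame, bank, n_fft):
    """Log mel filterbank energies (MFSC) for one frame."""
    power = power_spectrum(frame, n_fft)
    # Small constant avoids log(0) for filters that see no energy.
    return [math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
            for row in bank]
```

Applying `mfsc_frame` to successive overlapping frames of a clip and stacking the results gives a time-by-frequency feature map, which is the kind of two-dimensional input a CNN classifier can consume.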