Abstract:
With growing parallel computing power and an increasing volume of audio data, audio scene recognition has become an important research topic in the field of scene understanding. To address the difficulty of modeling audio scenes and the low accuracy of audio scene recognition, this paper proposes a Paralleling Convolutional Recurrent Neural Network model with a multi-optimization mechanism. The audio signal is first preprocessed and converted into a fixed-size Mel spectrogram, which is then fed into the network to learn both spatial and temporal features before the scene is recognized. To verify the effectiveness of the model, its recognition performance is tested on the DCASE2019 audio scene dataset. The results show that the model can reach an accuracy of 88.84% for audio scene recognition, higher than that of the traditional network model, which demonstrates its effectiveness for this task.
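
The preprocessing step mentioned above, converting an audio signal into a Mel spectrogram of fixed size, can be illustrated with a minimal sketch. The abstract does not state the sampling rate, FFT size, hop length, number of Mel bands, or number of frames, so every value below is an illustrative assumption, and the function names are hypothetical rather than taken from the paper. A naive DFT is used only to keep the example dependency-free; a practical pipeline would likely use an FFT routine from a signal processing library.

```python
import cmath
import math


def hz_to_mel(hz):
    # Common HTK-style mel formula (an assumption; the paper names no variant)
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge frequencies, expressed as fractional FFT bin positions
    edges = [mel_to_hz(top * i / (n_mels + 1)) * n_fft / sample_rate
             for i in range(n_mels + 2)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for k in range(n_bins):
            rising = (k - lo) / (mid - lo)
            falling = (hi - k) / (hi - mid)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def power_spectrum(frame):
    """Naive DFT power spectrum of one frame (non-negative frequencies only)."""
    n = len(frame)
    out = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        out.append(abs(acc) ** 2)
    return out


def log_mel_spectrogram(signal, sample_rate, n_fft=64, hop=32,
                        n_mels=8, n_frames=10):
    """Return a fixed-size log-Mel spectrogram: n_frames rows, n_mels columns."""
    # Periodic Hann window to reduce spectral leakage
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * t / n_fft)
              for t in range(n_fft)]
    # Zero-pad or truncate so that exactly n_frames frames fit
    needed = n_fft + hop * (n_frames - 1)
    signal = (list(signal) + [0.0] * needed)[:needed]
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    spec = []
    for f in range(n_frames):
        frame = [signal[f * hop + t] * window[t] for t in range(n_fft)]
        power = power_spectrum(frame)
        # Small constant avoids log(0) on silent frames
        spec.append([math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
                     for row in bank])
    return spec
```

With the default arguments, `log_mel_spectrogram(samples, 8000)` returns 10 rows of 8 values regardless of the input length, since the signal is padded or truncated first. This fixed shape is what lets every clip be presented to a network in the same form.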