Channel Separable Convolutional Neural Network for Action Recognition

    Abstract: 3D convolutional neural networks extract spatio-temporal features better than 2D convolutional neural networks, but at a significantly higher computational cost. To reduce the number of model parameters without the loss of accuracy that usually accompanies lower computational complexity, an end-to-end channel separable convolutional neural network is proposed. The 3D convolution is factorized by separating channel interaction from spatio-temporal interaction: a 1×1×1 conventional convolution handles channel interaction, and a 3×3×3 depthwise convolution handles spatio-temporal interaction. Compared with a traditional 3D convolutional neural network, the channel separable convolutional neural network introduces model regularization: training accuracy decreases while test accuracy increases, which indicates reduced overfitting. On the UCF-101 and HMDB-51 datasets, the network achieves accuracies of 92.7% and 64.5%, respectively. The results show that the channel separable convolutional neural network improves accuracy while reducing computational complexity.
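
To make the parameter saving from this factorization concrete, the following is a minimal sketch that counts the weights in one standard 3×3×3 convolution and in a 1×1×1 convolution paired with a 3×3×3 depthwise convolution. The channel width of 256, the function names, and the ordering (pointwise first, then depthwise on the output channels) are illustrative assumptions and are not taken from the paper; biases are ignored.

```python
def conv3d_params(c_in, c_out, kernel=(3, 3, 3)):
    """Weight count of a standard 3D convolution (bias omitted)."""
    kt, kh, kw = kernel
    # Every output channel mixes every input channel over the full kernel volume.
    return c_in * c_out * kt * kh * kw


def channel_separated_params(c_in, c_out, kernel=(3, 3, 3)):
    """Weight count of a 1x1x1 conv plus a depthwise conv (bias omitted).

    Assumption: the depthwise convolution runs on c_out channels,
    i.e. it follows the pointwise convolution.
    """
    kt, kh, kw = kernel
    pointwise = c_in * c_out          # 1x1x1 conv: channel interaction only
    depthwise = c_out * kt * kh * kw  # one filter per channel: spatio-temporal only
    return pointwise + depthwise


# Hypothetical layer width, chosen only for illustration.
standard = conv3d_params(256, 256)
separated = channel_separated_params(256, 256)
reduction = standard / separated

print(f"standard 3x3x3 conv: {standard} weights")
print(f"channel-separated:   {separated} weights")
print(f"reduction factor:    {reduction:.1f}x")
```

Under these assumptions the standard layer has 1,769,472 weights and the separated pair has 72,448, roughly a 24-fold reduction for this single layer. The saving for a whole network depends on how many layers are factorized and on their widths.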

     
