Channel Separable Convolutional Neural Network for Action Recognition

Yi Ziwen; Sun Zhonghua; Feng Jinchao; Jia Kebin

doi:10.16798/j.issn.1003-0530.2020.09.015

Yi Ziwen, Sun Zhonghua, Feng Jinchao, Jia Kebin. Channel Separable Convolutional Neural Network for Action Recognition[J]. JOURNAL OF SIGNAL PROCESSING, 2020, 36(9): 1497-1502. DOI: 10.16798/j.issn.1003-0530.2020.09.015


Abstract


3D convolutional neural networks extract spatio-temporal features better than 2D convolutional neural networks, but at a significantly higher computational cost. Reducing that cost usually lowers accuracy, so the key is to compress the model parameters efficiently. To this end, an end-to-end channel separable convolutional neural network is proposed. Each 3D convolution is decomposed so that channel interaction and spatio-temporal interaction are handled separately: a 1×1×1 conventional convolution handles channel interaction, and a 3×3×3 depthwise convolution handles spatio-temporal interaction. Compared with a traditional 3D convolutional neural network, the channel separable network has a regularizing effect: training accuracy decreases while testing accuracy improves, which indicates reduced overfitting. On the UCF-101 and HMDB-51 datasets, the network achieves 92.7% and 64.5% accuracy, respectively. The results show that the channel separable convolutional neural network improves accuracy while reducing computational complexity.
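
To make the parameter saving from this decomposition concrete, the following is a minimal sketch that counts the weights in one standard 3×3×3 convolution and in its channel-separated replacement (a 1×1×1 convolution plus a 3×3×3 depthwise convolution). The function names, the layer width of 256 channels, the omission of bias terms, and the assumption that the depthwise filter is applied to the output channels are all illustrative choices, not values taken from the paper.

```python
def conv3d_params(c_in, c_out, k=3):
    """Weights in a standard k x k x k 3D convolution (bias omitted)."""
    return c_in * c_out * k ** 3


def channel_separated_params(c_in, c_out, k=3):
    """Weights in a 1x1x1 convolution plus a k x k x k depthwise convolution."""
    pointwise = c_in * c_out    # 1x1x1: mixes channels, no spatio-temporal extent
    depthwise = c_out * k ** 3  # depthwise: one k x k x k filter per channel
    return pointwise + depthwise


# Hypothetical layer width, chosen only for illustration.
c_in = c_out = 256

standard = conv3d_params(c_in, c_out)
separated = channel_separated_params(c_in, c_out)
ratio = standard / separated

print(f"standard 3x3x3 conv: {standard} weights")
print(f"channel-separated:   {separated} weights")
print(f"reduction factor:    {ratio:.1f}x")
```

Under these assumptions the separated layer needs roughly 24 times fewer weights than the standard one, which illustrates why separating channel interaction from spatio-temporal interaction lowers the computational cost. The accuracy figures in the abstract come from the authors' experiments and cannot be reproduced from this count.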
