Multi-partitioned Spatiotemporal Graph Convolutional Network for Skeletal Action Recognition
Abstract: Compared with RGB video, human skeleton joint data is more robust to environmental changes and represents motion more directly, so skeleton-based action recognition has attracted increasing research attention. In recent years, models based on graph convolutional networks (GCNs) have performed well on this task. However, most GCN-based models use a fixed spatial configuration partitioning strategy and manually defined connections between joints, which limits their ability to adapt to the varying characteristics of different actions. To address this problem, this paper proposes an adaptive spatiotemporal graph convolutional network with multiple configuration partitions for skeleton-based action recognition. By searching for a more suitable number of configuration partitions and adaptively learning the connections between joints, the method makes fuller use of skeleton action features. Experiments on the NTU-RGBD and Kinetics-Skeleton datasets show that the proposed method achieves higher recognition accuracy than most previously published methods.
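For readers unfamiliar with the fixed spatial configuration partitioning that the abstract refers to, the following is a minimal sketch of the common three-subset variant (root, centripetal, centrifugal) on a toy skeleton. It is not the method proposed in this paper: the five-joint skeleton, the choice of joint 0 as the center, and the function names are all hypothetical, and `add_learned_offset` only illustrates one simple way an adaptive term could be combined with a fixed partition, under the assumption that the learned values come from training.

```python
from collections import deque


def hop_distances(num_joints, edges, center):
    """Breadth-first hop distance from every joint to the center joint.

    Assumes the skeleton graph is connected.
    """
    neighbors = {i: set() for i in range(num_joints)}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    dist = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return [dist[i] for i in range(num_joints)]


def spatial_partitions(num_joints, edges, center):
    """Split the adjacency (with self-loops) into three binary matrices.

    Index 0: root (same distance to the center, here only the joint itself)
    Index 1: centripetal (the neighbor is closer to the center)
    Index 2: centrifugal (the neighbor is farther from the center)
    """
    dist = hop_distances(num_joints, edges, center)
    parts = [[[0] * num_joints for _ in range(num_joints)] for _ in range(3)]
    links = [(i, i) for i in range(num_joints)]
    for a, b in edges:
        links += [(a, b), (b, a)]
    for i, j in links:
        if dist[j] == dist[i]:
            k = 0
        elif dist[j] < dist[i]:
            k = 1
        else:
            k = 2
        parts[k][i][j] = 1
    return parts


def add_learned_offset(fixed, offset):
    """Hypothetical adaptive step: element-wise sum of a fixed partition
    matrix and an offset matrix that would be learned during training."""
    return [[f + o for f, o in zip(f_row, o_row)]
            for f_row, o_row in zip(fixed, offset)]


# Toy skeleton (an assumption for illustration): two chains meeting at joint 0.
EDGES = [(0, 1), (1, 2), (0, 3), (3, 4)]
PARTS = spatial_partitions(5, EDGES, center=0)
```

In this fixed scheme the number of subsets is hard-coded to three and the links come directly from `EDGES`. These are the two choices that the abstract says the proposed network instead searches for and learns.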