Paper Title
Temporal Attention-Augmented Graph Convolutional Network for Efficient Skeleton-Based Human Action Recognition
Paper Authors
Abstract
Graph convolutional networks (GCNs) have been very successful in modeling non-Euclidean data structures, such as sequences of body skeletons forming actions modeled as spatio-temporal graphs. Most GCN-based action recognition methods use deep feed-forward networks with high computational complexity to process all skeletons in an action. This leads to a high number of floating-point operations (ranging from 16G to 100G FLOPs) to process a single sample, making their adoption infeasible in application scenarios with restricted computational resources. In this paper, we propose a temporal attention module (TAM) that increases the efficiency of skeleton-based action recognition by selecting the most informative skeletons of an action in the early layers of the network. We incorporate the TAM into a lightweight GCN topology to further reduce the overall number of computations. Experimental results on two benchmark datasets show that the proposed method outperforms the baseline GCN-based method by a large margin while requiring 2.9 times fewer computations. Moreover, it performs on par with the state of the art while requiring up to 9.6 times fewer computations.
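The core idea of the TAM described above, scoring each frame (skeleton) of a sequence and keeping only the top-k most informative frames before the deeper layers, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the pooling scheme, the learned scoring weights `w`, and the function name `temporal_attention_select` are assumptions for clarity.

```python
import numpy as np

def temporal_attention_select(x, w, keep_frames):
    """Hypothetical TAM sketch.

    x: feature tensor of shape (channels, frames, joints).
    w: (channels,) scoring weights (learned in practice; arbitrary here).
    Returns only the keep_frames highest-scoring frames, so all later
    layers process a shorter sequence and fewer FLOPs are spent.
    """
    pooled = x.mean(axis=2)                  # pool over joints -> (channels, frames)
    scores = w @ pooled                      # one scalar score per frame -> (frames,)
    attn = np.exp(scores - scores.max())     # softmax over the temporal axis
    attn /= attn.sum()
    # Indices of the top-k frames, re-sorted to preserve temporal order.
    keep = np.sort(np.argsort(attn)[-keep_frames:])
    return x[:, keep, :]

# Usage: 64-channel features, 30 frames, 25 joints (NTU-style skeleton).
rng = np.random.default_rng(0)
x = rng.standard_normal((64, 30, 25))
w = rng.standard_normal(64)
out = temporal_attention_select(x, w, keep_frames=10)
print(out.shape)  # → (64, 10, 25)
```

Selecting frames early, rather than attending over all of them in every layer, is what reduces the total computation: every subsequent graph convolution operates on 10 frames instead of 30 in this toy setting.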