Paper Title

Adaptive Local-Component-aware Graph Convolutional Network for One-shot Skeleton-based Action Recognition

Authors

Anqi Zhu, Qiuhong Ke, Mingming Gong, James Bailey

Abstract

Skeleton-based action recognition has received increasing attention because skeleton representations reduce the amount of training data needed by eliminating visual information irrelevant to actions. To further improve sample efficiency, meta-learning-based one-shot learning solutions have been developed for skeleton-based action recognition. These methods find the nearest neighbor according to the similarity between instance-level global average embeddings. However, such a measurement has unstable representativity due to insufficiently generalized learning of locally invariant and noisy features, while intuitively, fine-grained recognition usually relies on identifying key local body movements. To address this limitation, we present the Adaptive Local-Component-aware Graph Convolutional Network, which replaces the comparison metric with a focused sum of similarity measurements over aligned local embeddings of action-critical spatial/temporal segments. Comprehensive one-shot experiments on the public NTU-RGB+D 120 benchmark indicate that our method provides a stronger representation than the global embedding and helps our model reach state-of-the-art performance.
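
The core idea in the abstract is to replace nearest-neighbor matching over a single instance-level global average embedding with a focused sum of similarities over aligned local part embeddings. The sketch below contrasts the two metrics on toy data; it is a minimal illustration, not the authors' implementation, and the fixed body-part grouping, the cosine metric, and the top-k "focus" are all simplifying assumptions.

```python
import torch
import torch.nn.functional as F

def global_metric(query, support):
    """Instance-level baseline: one global average embedding per sequence."""
    # query, support: (J, T, C) skeleton feature maps (joints, frames, channels).
    q = query.mean(dim=(0, 1))   # (C,) global average embedding
    s = support.mean(dim=(0, 1))
    return F.cosine_similarity(q, s, dim=0)

def local_component_metric(query, support, parts, topk=3):
    """Part-wise alternative: pool each body part separately, then sum the
    similarities of the most similar parts -- a simple stand-in for a
    'focused sum' over action-critical local segments."""
    sims = []
    for joints in parts:
        q = query[joints].mean(dim=(0, 1))   # (C,) per-part embedding
        s = support[joints].mean(dim=(0, 1))
        sims.append(F.cosine_similarity(q, s, dim=0))
    sims = torch.stack(sims)                 # (num_parts,)
    return sims.topk(min(topk, sims.numel())).values.sum()

# Toy usage: 25 joints (NTU skeleton), 64 frames, 128 feature channels.
parts = [list(range(0, 5)),    # torso/head (hypothetical grouping)
         list(range(5, 12)),   # left arm/hand
         list(range(12, 19)),  # right arm/hand
         list(range(19, 25))]  # legs/feet
query, support = torch.randn(25, 64, 128), torch.randn(25, 64, 128)
print(global_metric(query, support).item())
print(local_component_metric(query, support, parts).item())
```

In the paper's method, the spatial/temporal segments are aligned and weighted adaptively rather than fixed as above; the static grouping and top-k selection here only illustrate why comparing local components can be more discriminative than comparing a single global average.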
