Human-Centered Prior-Guided and Task-Dependent Multi-Task Representation Learning for Action Recognition Pre-Training

Paper Title

Human-Centered Prior-Guided and Task-Dependent Multi-Task Representation Learning for Action Recognition Pre-Training

Authors

Wang, Guanhong, Lu, Keyu, Zhou, Yang, He, Zhanhao, Wang, Gaoang

Abstract

Recently, much progress has been made in self-supervised action recognition. Most existing approaches emphasize the contrastive relations among videos, including appearance and motion consistency. However, two main issues remain for existing pre-training methods: 1) the learned representation is neutral and not informative for a specific task; 2) multi-task learning-based pre-training sometimes leads to sub-optimal solutions because the domains of different tasks are inconsistent. To address these issues, we propose a novel action recognition pre-training framework that exploits human-centered prior knowledge to generate more informative representations, and that avoids conflicts between multiple tasks by using task-dependent representations. Specifically, we distill knowledge from a human parsing model to enrich the semantic capability of the representation. In addition, we combine knowledge distillation with contrastive learning to form a task-dependent multi-task framework. We achieve state-of-the-art performance on two popular action recognition benchmarks, UCF101 and HMDB51, verifying the effectiveness of our method.
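The abstract does not give the loss functions, head architectures, or loss weights, so the following is only a minimal sketch of the general idea of pairing a contrastive objective with a distillation objective through separate, task-dependent projection heads. Everything named here is an assumption for illustration: the InfoNCE form of the contrastive loss, the mean-squared-error distillation loss, the linear heads `w_contrast` and `w_distill`, the weight `lam`, and the `teacher_feats` standing in for features from a human parsing model. It should not be read as the authors' implementation.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def l2_normalize(v):
    norm = math.sqrt(dot(v, v)) or 1.0
    return [x / norm for x in v]

def linear(weights, x):
    """Hypothetical task-specific head: a plain matrix-vector product."""
    return [dot(row, x) for row in weights]

def info_nce(anchors, positives, temperature=0.1):
    """Contrastive loss (assumed InfoNCE form): anchor i should match
    positive i against all other positives in the batch."""
    anchors = [l2_normalize(a) for a in anchors]
    positives = [l2_normalize(p) for p in positives]
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [dot(a, p) / temperature for p in positives]
        m = max(logits)  # subtract the max for numerical stability
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)

def distillation_mse(student, teacher):
    """Distillation loss (assumed MSE): pull student features toward
    features produced by a frozen teacher."""
    count = sum(len(s) for s in student)
    squared = sum((s - t) ** 2
                  for sv, tv in zip(student, teacher)
                  for s, t in zip(sv, tv))
    return squared / count

def multitask_loss(view_a, view_b, teacher_feats,
                   w_contrast, w_distill, lam=1.0):
    """Shared backbone features go through a separate head per task,
    so each objective sees its own task-dependent representation."""
    za = [linear(w_contrast, f) for f in view_a]
    zb = [linear(w_contrast, f) for f in view_b]
    zd = [linear(w_distill, f) for f in view_a]
    return info_nce(za, zb) + lam * distillation_mse(zd, teacher_feats)
```

In this sketch, `view_a` and `view_b` would be backbone features of two augmented views of the same clips, and the two heads keep the contrastive and distillation objectives from being imposed on one shared embedding.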

Paper Title