Paper title

EPIC-KITCHENS-100 Unsupervised Domain Adaptation Challenge for Action Recognition 2022: Team HNU-FPV Technical Report

Authors

Nie Lin, Minjie Cai

Abstract

In this report, we present the technical details of our submission to the 2022 EPIC-KITCHENS Unsupervised Domain Adaptation (UDA) Challenge. Existing UDA methods align global features extracted from whole video clips across the source and target domains, but this feature matching suffers from spatial redundancy in video recognition. Motivated by the observation that, in most cases, a small image region in each video frame is informative enough for the action recognition task, we propose to exploit informative image regions to perform efficient domain alignment. Specifically, we first use lightweight CNNs to extract global information from the input two-stream video frames and select informative image patches with a differentiable interpolation-based selection strategy. The global information from the video frames and the local information from the image patches are then processed by an existing video adaptation method, TA3N, to align features between the source domain and the target domain. Our method (without model ensemble) ranks 4th among this year's teams on the test set of EPIC-KITCHENS-100.
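
The abstract does not spell out the patch selection mechanism, so the following is only a minimal sketch of what an interpolation-based crop generally looks like, not the authors' implementation. It assumes a single-channel frame stored as a list of rows and a patch centre given as continuous coordinates (in the report's setting, such a centre would presumably be predicted by the lightweight CNN). The names `bilinear_sample` and `crop_patch` are hypothetical. Because each output pixel is a weighted sum of its four neighbours with weights that vary linearly in the fractional offsets, the crop changes smoothly as the centre moves, which is what lets gradients reach the centre coordinates in an autodiff framework.

```python
import math


def bilinear_sample(image, y, x):
    """Sample a 2D grid (list of rows) at a continuous location (y, x)."""
    h, w = len(image), len(image[0])
    # Clamp the location so all four neighbours stay inside the image.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = (1 - dx) * image[y0][x0] + dx * image[y0][x1]
    bottom = (1 - dx) * image[y1][x0] + dx * image[y1][x1]
    return (1 - dy) * top + dy * bottom


def crop_patch(image, center_y, center_x, size):
    """Crop a size x size patch centred at a continuous (center_y, center_x)."""
    offset = (size - 1) / 2.0
    return [
        [
            bilinear_sample(image, center_y - offset + i, center_x - offset + j)
            for j in range(size)
        ]
        for i in range(size)
    ]


if __name__ == "__main__":
    # Toy 4x4 "frame" whose pixel value is 4 * row + column.
    frame = [[4 * r + c for c in range(4)] for r in range(4)]
    print(crop_patch(frame, 1.5, 1.5, 2))   # centre on the pixel grid
    print(crop_patch(frame, 1.75, 1.5, 2))  # centre shifted by a quarter pixel
```

In a full pipeline, the same sampling would typically run on batched feature tensors inside a deep learning framework, and the resulting patch features would be passed, together with the global frame features, to the alignment stage.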
