Generalizable Re-Identification from Videos with Cycle Association

Paper Title

Generalizable Re-Identification from Videos with Cycle Association

Authors

Zhongdao Wang, Zhaopeng Dou, Jingwei Zhang, Liang Zheng, Yifan Sun, Yali Li, Shengjin Wang

Abstract

In this paper, we are interested in learning a generalizable person re-identification (re-ID) representation from unlabeled videos. Compared with 1) the popular unsupervised re-ID setting, where the training and test sets are typically drawn from the same domain, and 2) the popular domain generalization (DG) re-ID setting, where the training samples are labeled, our novel scenario combines their key challenges: the training samples are unlabeled, and they are collected from various domains that do not align with the test domain. In other words, we aim to learn a representation in an unsupervised manner and directly use the learned representation for re-ID in novel domains. To fulfill this goal, we make two main contributions. First, we propose Cycle Association (CycAs), a scalable self-supervised learning method for re-ID with low training complexity. Second, we construct a large-scale unlabeled re-ID dataset named LMP-video, tailored for the proposed method. Specifically, CycAs learns re-ID features by enforcing cycle consistency of instance association between temporally successive video frame pairs, and its training cost is only linear in the data size, which makes large-scale training possible. The LMP-video dataset, in turn, is extremely large: it contains 50 million unlabeled person images cropped from over 10K YouTube videos, and is therefore sufficient to serve as fertile soil for self-supervised learning. When trained on LMP-video, CycAs generalizes well to novel domains, and its results sometimes even outperform supervised domain-generalizable models. Remarkably, CycAs achieves 82.2% Rank-1 on Market-1501 and 49.0% Rank-1 on MSMT17 with zero human annotation, surpassing state-of-the-art supervised DG re-ID methods. Moreover, we also demonstrate the superiority of CycAs under the canonical unsupervised re-ID and the pretrain-and-finetune scenarios.
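
The abstract describes cycle consistency of instance association only at a high level, so the following is a minimal illustrative sketch of the general idea and not the authors' implementation. It assumes that each frame yields a matrix of per-person embeddings, that association is a softmax over scaled cosine similarities, and that the loss is the negative log of the diagonal of the round-trip (frame t to frame t+1 and back) association matrix. The function names, the `temperature` parameter, and this particular loss form are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cycle_association_loss(feats_a, feats_b, temperature=0.1):
    """Hypothetical cycle-consistency loss between two frames.

    feats_a: (n, d) embeddings of person crops in frame t.
    feats_b: (m, d) embeddings of person crops in frame t+1.
    The temperature and the loss form are assumptions, not from the paper.
    """
    a = l2_normalize(np.asarray(feats_a, dtype=float))
    b = l2_normalize(np.asarray(feats_b, dtype=float))
    sim = a @ b.T / temperature        # (n, m) scaled cosine similarity
    forward = softmax(sim, axis=1)     # soft association: frame t -> frame t+1
    backward = softmax(sim.T, axis=1)  # soft association: frame t+1 -> frame t
    cycle = forward @ backward         # (n, n) round trip: t -> t+1 -> t
    # Cycle consistency: every instance should return to itself,
    # so the diagonal of the round-trip matrix should be close to 1.
    diag = np.clip(np.diag(cycle), 1e-12, None)
    return float(-np.log(diag).mean())


if __name__ == "__main__":
    frame_t = np.eye(4)                 # four well-separated toy identities
    frame_t1 = frame_t[[2, 0, 3, 1]]    # same identities, shuffled order
    print(cycle_association_loss(frame_t, frame_t1))
```

In this toy setup the loss is close to zero when the second frame contains the same identities in a different order, and it grows when the embeddings cannot tell the identities apart. No identity labels are used at any point, which is the property the abstract emphasizes.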
