Paper Title
Beyond Transfer Learning: Co-finetuning for Action Localisation
Paper Authors
Paper Abstract
Transfer learning is the predominant paradigm for training deep networks on small target datasets. Models are typically pretrained on large ``upstream'' datasets for classification, as such labels are easy to collect, and then finetuned on ``downstream'' tasks such as action localisation, which are smaller due to their finer-grained annotations. In this paper, we question this approach, and propose co-finetuning -- simultaneously training a single model on multiple ``upstream'' and ``downstream'' tasks. We demonstrate that co-finetuning outperforms traditional transfer learning when using the same total amount of data, and also show how we can easily extend our approach to multiple ``upstream'' datasets to further improve performance. In particular, co-finetuning significantly improves the performance on rare classes in our downstream task, as it has a regularising effect, and enables the network to learn feature representations that transfer between different datasets. Finally, we observe that by co-finetuning with public video classification datasets, we are able to achieve state-of-the-art results for spatio-temporal action localisation on the challenging AVA and AVA-Kinetics datasets, outperforming recent works which develop intricate models.
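The abstract describes co-finetuning as training one model simultaneously on several ``upstream'' and ``downstream'' datasets, rather than pretraining and then finetuning in sequence. The sketch below is a minimal illustration of that idea, not the paper's implementation: a shared backbone with one classification head per dataset, where every optimisation step mixes batches from all tasks so the shared weights receive gradients from every dataset at once. The model architecture, dataset names, batch sizes, and equal loss weighting are illustrative assumptions; the actual work uses video models and a detection-style downstream task that this toy example does not reproduce.

```python
# Minimal co-finetuning sketch (illustrative, not the authors' code).
# A shared backbone is updated with mini-batches drawn from several
# datasets in the same training loop; each dataset has its own head.
import torch
import torch.nn as nn


class CoFinetuneModel(nn.Module):
    def __init__(self, feat_dim, num_classes_per_task):
        super().__init__()
        # Shared backbone (stand-in for a video network).
        self.backbone = nn.Sequential(nn.Linear(512, feat_dim), nn.ReLU())
        # One classification head per task/dataset.
        self.heads = nn.ModuleDict({
            task: nn.Linear(feat_dim, n)
            for task, n in num_classes_per_task.items()
        })

    def forward(self, x, task):
        return self.heads[task](self.backbone(x))


# Toy stand-ins for an "upstream" classification dataset and a
# "downstream" dataset; class counts are hypothetical.
tasks = {"upstream_classification": 400, "downstream_localisation": 80}
model = CoFinetuneModel(feat_dim=256, num_classes_per_task=tasks)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

for step in range(10):
    opt.zero_grad()
    total_loss = 0.0
    # Each step mixes batches from all datasets, so the shared backbone
    # is regularised by gradients from every task simultaneously.
    for task, num_classes in tasks.items():
        x = torch.randn(8, 512)                  # dummy clip features
        y = torch.randint(0, num_classes, (8,))  # dummy labels
        total_loss = total_loss + loss_fn(model(x, task), y)
    total_loss.backward()
    opt.step()
```

Here the per-task losses are simply summed with equal weight; how batches are sampled and how losses are weighted across datasets are design choices that affect the balance between upstream and downstream performance.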