Rethinking Few-Shot Object Detection on a Multi-Domain Benchmark

Paper Title

Rethinking Few-Shot Object Detection on a Multi-Domain Benchmark

Authors

Kibok Lee, Hao Yang, Satyaki Chakraborty, Zhaowei Cai, Gurumurthy Swaminathan, Avinash Ravichandran, Onkar Dabeer

Abstract

Most existing works on few-shot object detection (FSOD) focus on a setting where both pre-training and few-shot learning datasets are from a similar domain. However, few-shot algorithms are important in multiple domains; hence evaluation needs to reflect the broad applications. We propose a Multi-dOmain Few-Shot Object Detection (MoFSOD) benchmark consisting of 10 datasets from a wide range of domains to evaluate FSOD algorithms. We comprehensively analyze the impacts of freezing layers, different architectures, and different pre-training datasets on FSOD performance. Our empirical results show several key factors that have not been explored in previous works: 1) contrary to previous belief, on a multi-domain benchmark, fine-tuning (FT) is a strong baseline for FSOD, performing on par with or better than the state-of-the-art (SOTA) algorithms; 2) utilizing FT as the baseline allows us to explore multiple architectures, and we found them to have a significant impact on down-stream few-shot tasks, even with similar pre-training performances; 3) by decoupling pre-training and few-shot learning, MoFSOD allows us to explore the impact of different pre-training datasets, and the right choice can boost the performance of the down-stream tasks significantly. Based on these findings, we list possible avenues of investigation for improving FSOD performance and propose two simple modifications to existing algorithms that lead to SOTA performance on the MoFSOD benchmark. The code is available at https://github.com/amazon-research/few-shot-object-detection-benchmark.
