Interactron: Embodied Adaptive Object Detection

Paper Title

Interactron: Embodied Adaptive Object Detection

Authors

Klemen Kotar, Roozbeh Mottaghi

Abstract

Over the years, various methods have been proposed for the problem of object detection. Recently, we have witnessed great strides in this domain owing to the emergence of powerful deep neural networks. However, these approaches typically share two main assumptions. First, the model is trained on a fixed training set and evaluated on a pre-recorded test set. Second, the model is kept frozen after the training phase, so no further updates are performed once training is finished. These two assumptions limit the applicability of these methods to real-world settings. In this paper, we propose Interactron, a method for adaptive object detection in an interactive setting, where the goal is to perform object detection in images observed by an embodied agent navigating in different environments. Our idea is to continue training during inference and adapt the model at test time, without any explicit supervision, by interacting with the environment. Our adaptive object detection model provides a 7.2 point improvement in AP (and 12.7 points in AP50) over DETR, a recent, high-performance object detector. Moreover, we show that our object detection model adapts to environments with completely different appearance characteristics and performs well in them. The code is available at: https://github.com/allenai/interactron
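
The abstract's core idea, continuing to update a trained model at inference time without labels, can be illustrated with a toy sketch. The example below is not the Interactron method and does not use DETR or an embodied agent. It swaps in a generic, well-known stand-in (minimizing prediction entropy on an unlabeled test batch) applied to a small linear softmax classifier. All names, the random "pre-trained" weights, the synthetic inputs, and the learning rate are assumptions made for illustration only.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def predict(weights, bias, x):
    """Linear classifier: one logit per class, followed by softmax."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)


def mean_entropy(weights, bias, batch):
    """Average prediction entropy over a batch of unlabeled inputs."""
    return sum(entropy(predict(weights, bias, x)) for x in batch) / len(batch)


def adapt(weights, bias, batch, lr=0.05, steps=20):
    """Gradient descent on mean prediction entropy over an unlabeled batch.

    Returns new (weights, bias); the inputs are left unmodified.
    """
    weights = [row[:] for row in weights]
    bias = bias[:]
    n = len(batch)
    for _ in range(steps):
        grad_w = [[0.0] * len(row) for row in weights]
        grad_b = [0.0] * len(bias)
        for x in batch:
            p = predict(weights, bias, x)
            h = entropy(p)
            for j, pj in enumerate(p):
                # Gradient of entropy w.r.t. logit j: -p_j * (log p_j + H)
                g = -pj * (math.log(pj) + h) if pj > 0.0 else 0.0
                grad_b[j] += g / n
                for k, xk in enumerate(x):
                    grad_w[j][k] += g * xk / n
        for j in range(len(bias)):
            bias[j] -= lr * grad_b[j]
            for k in range(len(weights[j])):
                weights[j][k] -= lr * grad_w[j][k]
    return weights, bias


rng = random.Random(0)
# Hypothetical "pre-trained" 3-class model on 4 features (random weights here),
# plus a batch of unlabeled inputs standing in for test-time observations.
weights = [[rng.gauss(0, 0.5) for _ in range(4)] for _ in range(3)]
bias = [0.0, 0.0, 0.0]
batch = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(32)]

before = mean_entropy(weights, bias, batch)
new_weights, new_bias = adapt(weights, bias, batch)
after = mean_entropy(new_weights, new_bias, batch)
print(f"mean entropy before: {before:.4f}, after: {after:.4f}")
```

The sketch only shows the general pattern of label-free updates at test time: the model parameters change using nothing but the unlabeled inputs seen during inference. How Interactron itself derives its update signal from interaction with the environment is described in the paper and the linked repository, not here.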

