Paper Title
Addressing Visual Search in Open and Closed Set Settings
Paper Authors
Paper Abstract
Searching for small objects in large images is a task that is both challenging for current deep learning systems and important in numerous real-world applications, such as remote sensing and medical imaging. Thorough scanning of very large images is computationally expensive, particularly at resolutions sufficient to capture small objects. The smaller an object of interest, the more likely it is to be obscured by clutter or otherwise deemed insignificant. We examine these issues in the context of two complementary problems: closed-set object detection and open-set target search. First, we present a method for predicting pixel-level objectness from a low-resolution gist image, which we then use to select regions for performing object detection locally at high resolution. This approach has the benefit of not being fixed to a predetermined grid, thereby requiring fewer costly high-resolution glimpses than existing methods. Second, we propose a novel strategy for open-set visual search that seeks to find all instances of a target class that may be previously unseen and is defined by a single image. We interpret both detection problems through a probabilistic, Bayesian lens, whereby the objectness maps produced by our method serve as priors in a maximum-a-posteriori approach to the detection step. We evaluate the end-to-end performance of two combinations: our patch selection strategy with this target search approach, and our patch selection strategy with standard object detection methods. Both elements of our approach significantly outperform baseline strategies.
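The two ideas in the abstract, grid-free glimpse selection from an objectness map and the use of that map as a prior at detection time, can be illustrated with a minimal sketch. This is not the authors' implementation: the greedy window selection, the function names `select_glimpses` and `map_rescore`, and the choice to combine likelihood and prior by adding log-probabilities are all assumptions made purely for illustration.

```python
import numpy as np


def select_glimpses(objectness, patch, k):
    """Greedily pick up to k patch-sized windows with the highest total objectness.

    objectness: 2-D array of per-pixel scores on the low-resolution gist grid.
    patch: window side length, in low-resolution pixels (hypothetical parameter).
    Returns top-left (row, col) corners; they are not tied to a fixed grid.
    """
    remaining = np.asarray(objectness, dtype=float).copy()
    corners = []
    for _ in range(k):
        # A summed-area table yields the sum of every patch x patch window at once.
        sat = np.pad(remaining, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
        sums = (sat[patch:, patch:] - sat[:-patch, patch:]
                - sat[patch:, :-patch] + sat[:-patch, :-patch])
        r, c = np.unravel_index(np.argmax(sums), sums.shape)
        if sums[r, c] <= 1e-9:  # nothing object-like is left to look at
            break
        corners.append((int(r), int(c)))
        # Zero out the covered pixels so later glimpses look elsewhere.
        remaining[r:r + patch, c:c + patch] = 0.0
    return corners


def map_rescore(likelihood, prior, eps=1e-9):
    """Unnormalized log-posterior: log p(obs | object) + log p(object).

    Assumes the detector score behaves like a likelihood and the objectness
    value at the candidate location behaves like a prior probability.
    """
    likelihood = np.asarray(likelihood, dtype=float)
    prior = np.asarray(prior, dtype=float)
    return np.log(likelihood + eps) + np.log(prior + eps)
```

Under these assumptions, each corner returned by `select_glimpses` would be scaled up to full-resolution coordinates and cropped for the detector, and `map_rescore` would rank the resulting candidates so that a confident detection in a low-objectness region can rank below a moderate detection in a high-objectness region.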