Robust Region Feature Synthesizer for Zero-Shot Object Detection

Paper Title

Robust Region Feature Synthesizer for Zero-Shot Object Detection

Authors

Peiliang Huang, Junwei Han, De Cheng, Dingwen Zhang

Abstract

Zero-shot object detection aims to incorporate class semantic vectors so that (both seen and) unseen classes can be detected in an unconstrained test image. In this study, we reveal the core challenge in this research area: how to synthesize robust region features (for unseen objects) that are as intra-class diverse and inter-class separable as the real samples, so that strong unseen object detectors can be trained upon them. To address this challenge, we build a novel zero-shot object detection framework that contains an Intra-class Semantic Diverging component and an Inter-class Structure Preserving component. The former realizes a one-to-many mapping that produces diverse visual features from each class semantic vector, which prevents real unseen objects from being misclassified as image background. The latter keeps the synthesized features from becoming so scattered that they mix up the inter-class and foreground-background relationships. To demonstrate the effectiveness of the proposed approach, we conduct comprehensive experiments on the PASCAL VOC, COCO, and DIOR datasets. Notably, our approach achieves new state-of-the-art performance on PASCAL VOC and COCO, and it is the first study to carry out zero-shot object detection in remote sensing imagery.
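The one-to-many mapping mentioned in the abstract can be illustrated with a toy sketch. This is not the paper's synthesizer: it assumes a generic noise-conditioned generator in which one class semantic vector is concatenated with fresh Gaussian noise and passed through a fixed linear map. The names `synthesize_features`, `semantic_vec`, and `weights`, and all dimensions, are hypothetical and chosen only for illustration.

```python
import random


def synthesize_features(semantic_vec, weights, n_samples, noise_dim, rng):
    """Map ONE class semantic vector to several synthetic feature vectors.

    Assumption (not from the paper): each sample concatenates the semantic
    vector with fresh Gaussian noise and applies a fixed linear map.
    """
    features = []
    for _ in range(n_samples):
        # Fresh noise per sample is what makes the mapping one-to-many.
        noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
        inp = list(semantic_vec) + noise
        # One output value per weight row: a plain matrix-vector product.
        features.append([sum(w * x for w, x in zip(row, inp)) for row in weights])
    return features


rng = random.Random(0)
semantic_vec = [0.2, -0.5, 0.9]  # hypothetical class embedding
noise_dim = 2                    # hypothetical noise size
feat_dim = 4                     # hypothetical region-feature size

# Hypothetical, untrained generator weights (feat_dim rows).
weights = [
    [rng.uniform(-1.0, 1.0) for _ in range(len(semantic_vec) + noise_dim)]
    for _ in range(feat_dim)
]

feats = synthesize_features(semantic_vec, weights, n_samples=5,
                            noise_dim=noise_dim, rng=rng)
```

Here the same `semantic_vec` yields five different feature vectors. In a trained model the weights would be learned, and additional objectives would be needed to keep the samples of different classes separable, which this sketch does not attempt.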
