Paper Title
GTNet: Generative Transfer Network for Zero-Shot Object Detection
Paper Authors
Abstract
We propose a Generative Transfer Network (GTNet) for zero-shot object detection (ZSD). GTNet consists of an Object Detection Module and a Knowledge Transfer Module. The Object Detection Module can learn large-scale seen domain knowledge. The Knowledge Transfer Module leverages a feature synthesizer to generate unseen class features, which are applied to train a new classification layer for the Object Detection Module. In order to synthesize features for each unseen class with both the intra-class variance and the IoU variance, we design an IoU-Aware Generative Adversarial Network (IoUGAN) as the feature synthesizer, which can be easily integrated into GTNet. Specifically, IoUGAN consists of three unit models: the Class Feature Generating Unit (CFU), the Foreground Feature Generating Unit (FFU), and the Background Feature Generating Unit (BFU). CFU generates unseen features with intra-class variance, conditioned on the class semantic embeddings. FFU and BFU add the IoU variance to the results of CFU, yielding class-specific foreground and background features, respectively. We evaluate our method on three public datasets, and the results demonstrate that our method performs favorably against state-of-the-art ZSD approaches.
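The abstract's generator pipeline (CFU conditioned on a class semantic embedding, with FFU and BFU refining the CFU output into foreground and background features) can be sketched as plain Python. This is a minimal illustrative sketch only: the toy dimensions, single linear maps, and function names are assumptions, not the paper's actual architecture, loss functions, or discriminators.

```python
import random

def linear(x, w):
    # Plain matrix-vector product; w is a list of rows.
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def make_weights(out_dim, in_dim, seed):
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]

EMB, NOISE, FEAT = 4, 2, 3  # toy dimensions (assumed, not from the paper)

W_CFU = make_weights(FEAT, EMB + NOISE, 0)
W_FFU = make_weights(FEAT, FEAT + NOISE, 1)
W_BFU = make_weights(FEAT, FEAT + NOISE, 2)

def cfu(class_embedding, noise):
    # Class Feature Generating Unit: synthesizes an unseen-class feature
    # with intra-class variance (via the noise vector), conditioned on
    # the class semantic embedding.
    return linear(class_embedding + noise, W_CFU)

def ffu(class_feature, noise):
    # Foreground Feature Generating Unit: adds IoU variance to the CFU
    # output, yielding a class-specific foreground feature.
    return linear(class_feature + noise, W_FFU)

def bfu(class_feature, noise):
    # Background Feature Generating Unit: adds IoU variance to the CFU
    # output, yielding a class-specific background feature.
    return linear(class_feature + noise, W_BFU)

rng = random.Random(42)
embedding = [rng.gauss(0.0, 1.0) for _ in range(EMB)]  # stand-in semantic embedding
def z():
    return [rng.gauss(0.0, 1.0) for _ in range(NOISE)]

cls_feat = cfu(embedding, z())
fg_feat = ffu(cls_feat, z())
bg_feat = bfu(cls_feat, z())
print(len(cls_feat), len(fg_feat), len(bg_feat))  # all FEAT-dimensional
```

In the full method, such synthesized foreground and background features for unseen classes would then be used to train a new classification layer of the detection module; here we only show the three-unit generation chain.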