Paper Title
GTNet: Generative Transfer Network for Zero-Shot Object Detection
Paper Authors
Abstract
We propose a Generative Transfer Network (GTNet) for zero-shot object detection (ZSD). GTNet consists of an Object Detection Module and a Knowledge Transfer Module. The Object Detection Module can learn large-scale seen domain knowledge. The Knowledge Transfer Module leverages a feature synthesizer to generate unseen class features, which are applied to train a new classification layer for the Object Detection Module. In order to synthesize features for each unseen class with both the intra-class variance and the IoU variance, we design an IoU-Aware Generative Adversarial Network (IoUGAN) as the feature synthesizer, which can be easily integrated into GTNet. Specifically, IoUGAN consists of three unit models: the Class Feature Generating Unit (CFU), the Foreground Feature Generating Unit (FFU), and the Background Feature Generating Unit (BFU). CFU generates unseen features with intra-class variance, conditioned on the class semantic embeddings. FFU and BFU add the IoU variance to the results of CFU, yielding class-specific foreground and background features, respectively. We evaluate our method on three public datasets, and the results demonstrate that our method performs favorably against state-of-the-art ZSD approaches.
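The abstract's generator pipeline (CFU conditioned on a class semantic embedding, with FFU and BFU refining the CFU output into foreground and background features) can be sketched as plain Python. This is a minimal illustrative sketch only: the toy dimensions, single linear maps, and function names are assumptions, not the paper's actual architecture, loss functions, or discriminators.

```python
import random

def linear(x, w):
    # Plain matrix-vector product; w is a list of rows.
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def make_weights(out_dim, in_dim, seed):
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]

EMB, NOISE, FEAT = 4, 2, 3  # toy dimensions (assumed, not from the paper)

W_CFU = make_weights(FEAT, EMB + NOISE, 0)
W_FFU = make_weights(FEAT, FEAT + NOISE, 1)
W_BFU = make_weights(FEAT, FEAT + NOISE, 2)

def cfu(class_embedding, noise):
    # Class Feature Generating Unit: synthesizes an unseen-class feature
    # with intra-class variance (via the noise vector), conditioned on
    # the class semantic embedding.
    return linear(class_embedding + noise, W_CFU)

def ffu(class_feature, noise):
    # Foreground Feature Generating Unit: adds IoU variance to the CFU
    # output, yielding a class-specific foreground feature.
    return linear(class_feature + noise, W_FFU)

def bfu(class_feature, noise):
    # Background Feature Generating Unit: adds IoU variance to the CFU
    # output, yielding a class-specific background feature.
    return linear(class_feature + noise, W_BFU)

rng = random.Random(42)
embedding = [rng.gauss(0.0, 1.0) for _ in range(EMB)]  # stand-in semantic embedding
def z():
    return [rng.gauss(0.0, 1.0) for _ in range(NOISE)]

cls_feat = cfu(embedding, z())
fg_feat = ffu(cls_feat, z())
bg_feat = bfu(cls_feat, z())
print(len(cls_feat), len(fg_feat), len(bg_feat))  # all FEAT-dimensional
```

In the full method, such synthesized foreground and background features for unseen classes would then be used to train a new classification layer of the detection module; here we only show the three-unit generation chain.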