Paper Title

Resistance Training using Prior Bias: toward Unbiased Scene Graph Generation

Authors

Chao Chen, Yibing Zhan, Baosheng Yu, Liu Liu, Yong Luo, Bo Du

Abstract

Scene Graph Generation (SGG) aims to build a structured representation of a scene using objects and pairwise relationships, which benefits downstream tasks. However, current SGG methods usually suffer from sub-optimal scene graph generation because of the long-tailed distribution of the training data. To address this problem, we propose Resistance Training using Prior Bias (RTPB) for scene graph generation. Specifically, RTPB uses a distribution-based prior bias to improve the model's ability to detect less frequent relationships during training, thus improving model generalizability on tail categories. In addition, to further exploit the contextual information of objects and relationships, we design a contextual encoding backbone network, termed Dual Transformer (DTrans). We perform extensive experiments on the widely used benchmark VG150 to demonstrate the effectiveness of our method for unbiased scene graph generation. Specifically, our RTPB achieves an improvement of over 10% in mean recall when applied to current SGG methods. Furthermore, DTrans with RTPB outperforms nearly all state-of-the-art methods by a large margin.
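The core idea of a distribution-based prior bias can be sketched as adding a logit offset derived from the empirical class frequencies during training, so that the loss must "resist" the head-class advantage and push rare relationship classes harder. The sketch below is a minimal illustration in this spirit; the function names, the `tau` temperature, and the exact form of the bias are assumptions for illustration, not the precise formulation from the paper.

```python
import numpy as np

def prior_log_bias(class_counts, tau=1.0, eps=1e-12):
    """Hypothetical distribution-based prior bias: tau * log of the
    empirical class prior (the exact RTPB bias may differ)."""
    prior = np.asarray(class_counts, dtype=np.float64)
    prior = prior / prior.sum()
    return tau * np.log(prior + eps)

def training_logits(raw_logits, bias):
    """Add the prior bias to the logits during training only, so the
    training loss must overcome the head-class head start; inference
    would use the unbiased raw_logits."""
    return np.asarray(raw_logits, dtype=np.float64) + bias

# Toy long-tailed relationship counts: one head class, two tail classes.
counts = [900, 90, 10]
bias = prior_log_bias(counts)
# The head class receives the largest (least negative) bias, so its
# predictions are "free" during training while tail classes are not.
```

Under this sketch, a classifier trained against the biased logits must produce larger raw logits for tail classes to achieve the same loss, which is one common way long-tail biasing schemes improve tail-class recall.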
