Paper Title

SG-Shuffle: Multi-aspect Shuffle Transformer for Scene Graph Generation

Authors

Anh Duc Bui, Soyeon Caren Han, Josiah Poon

Abstract

Scene Graph Generation (SGG) serves as a comprehensive representation of images, both for human understanding and for visual understanding tasks. Because the object and predicate labels in the available annotated data follow a long-tailed distribution, scene graphs generated by current methods can be biased toward common, non-informative relationship labels. Relationships are also sometimes not mutually exclusive: the same relationship can be described from multiple perspectives, such as geometric or semantic, which makes it even more challenging to predict the most suitable relationship label. In this work, we propose the SG-Shuffle pipeline for scene graph generation, which has three components: 1) a Parallel Transformer Encoder, which learns to predict object relationships in a more exclusive manner by grouping relationship labels into groups of similar purpose; 2) a Shuffle Transformer, which learns to select the final relationship labels from the category-specific features generated in the previous step; and 3) a weighted cross-entropy (CE) loss, used to alleviate the training bias caused by the imbalanced dataset.
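To make the third component more concrete, the following is a minimal sketch of a class-weighted cross-entropy loss. The abstract does not state how the weights are chosen, so the inverse-frequency weighting here is an assumption, and the function names `inverse_frequency_weights` and `weighted_cross_entropy` are hypothetical, not taken from the paper.

```python
import math


def inverse_frequency_weights(label_counts):
    """Assumed weighting scheme: total / (num_classes * count).

    Rare predicate classes get larger weights than frequent ones.
    """
    total = sum(label_counts)
    num_classes = len(label_counts)
    return [total / (num_classes * count) for count in label_counts]


def weighted_cross_entropy(logits, targets, weights):
    """Weighted mean of per-sample cross-entropy.

    logits:  list of per-class score lists, one list per sample
    targets: list of integer class indices, one per sample
    weights: list of per-class weights
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for scores, target in zip(logits, targets):
        # Numerically stable log-sum-exp for the softmax normaliser.
        peak = max(scores)
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
        negative_log_likelihood = log_norm - scores[target]
        weighted_sum += weights[target] * negative_log_likelihood
        weight_total += weights[target]
    # Normalise by the total weight of the samples in the batch.
    return weighted_sum / weight_total


# Hypothetical predicate counts: one very common class, two rare ones.
class_weights = inverse_frequency_weights([90, 9, 1])

# Both samples are scored as class 0; the second one truly belongs to rare class 2.
batch_logits = [[2.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
batch_targets = [0, 2]

plain_loss = weighted_cross_entropy(batch_logits, batch_targets, [1.0, 1.0, 1.0])
balanced_loss = weighted_cross_entropy(batch_logits, batch_targets, class_weights)
```

With these weights, the error on the rare class dominates the batch loss, so a model that always predicts the most common predicate is penalised more heavily than under the unweighted loss.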
