Paper Title
Learning Equivariant Segmentation with Instance-Unique Querying
Paper Authors
Paper Abstract
Prevalent state-of-the-art instance segmentation methods fall into a query-based scheme, in which instance masks are derived by querying the image feature using a set of instance-aware embeddings. In this work, we devise a new training framework that boosts query-based models through discriminative query embedding learning. It explores two essential properties of the relation between queries and instances, namely dataset-level uniqueness and transformation equivariance. First, our algorithm uses the queries to retrieve the corresponding instances from the whole training dataset, instead of only searching within individual scenes. As querying instances across scenes is more challenging, the segmenters are forced to learn more discriminative queries for effective instance separation. Second, our algorithm encourages both image (instance) representations and queries to be equivariant against geometric transformations, leading to more robust instance-query matching. On top of four well-known query-based models (i.e., CondInst, SOLOv2, SOTR, and Mask2Former), our training algorithm provides significant performance gains (e.g., +1.6-3.2 AP) on the COCO dataset. In addition, our algorithm improves the performance of SOLOv2 by 2.7 AP on the LVISv1 dataset.
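To make the two training signals in the abstract concrete, below is a minimal PyTorch sketch of (1) a cross-scene query-instance retrieval loss, where each query must pick out its own instance among instances pooled from the whole batch rather than a single image, and (2) an equivariance consistency loss under a geometric transformation (a horizontal flip stands in for the general case). This is our own illustration, not the authors' released implementation; the helper names (`cross_scene_query_loss`, `equivariance_loss`), tensor shapes, the temperature value, and the choice of losses are assumptions made for clarity.

```python
# Illustrative sketch only; shapes, names, and loss choices are assumptions,
# not the paper's official code.
import torch
import torch.nn.functional as F


def cross_scene_query_loss(queries, instance_embs, labels, temperature=0.1):
    """Dataset-level uniqueness: each query must retrieve its own instance
    among instances gathered across many images (here, the whole batch or
    a memory bank approximating the training set), not just its own scene.

    queries:       (Q, D) query embeddings from one image
    instance_embs: (N, D) instance embeddings pooled across the batch
    labels:        (Q,)   index of the matching instance for each query
    """
    q = F.normalize(queries, dim=-1)
    k = F.normalize(instance_embs, dim=-1)
    logits = q @ k.t() / temperature          # (Q, N) query-to-instance similarities
    return F.cross_entropy(logits, labels)    # contrastive matching over all instances


def equivariance_loss(masks_from_transformed, masks_from_original, flip=True):
    """Transformation equivariance: masks predicted on a transformed image
    should equal the transformed masks predicted on the original image.
    """
    target = masks_from_original.flip(-1) if flip else masks_from_original
    return F.mse_loss(masks_from_transformed, target)


if __name__ == "__main__":
    # Toy shapes only; a real segmenter's query decoder would produce these.
    Q, N, D, H, W = 4, 32, 256, 64, 64
    queries = torch.randn(Q, D)
    instance_embs = torch.randn(N, D)
    labels = torch.randint(0, N, (Q,))
    masks_flipped = torch.rand(Q, H, W)       # predictions on the flipped image
    masks_original = torch.rand(Q, H, W)      # predictions on the original image

    loss = cross_scene_query_loss(queries, instance_embs, labels) \
         + equivariance_loss(masks_flipped, masks_original)
    print(loss.item())
```

In this reading, the first loss is what makes the queries discriminative at the dataset level (distractor instances come from other scenes), while the second regularizes both the instance representations and the queries to transform consistently with the input, which the abstract credits for more robust instance-query matching.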