Paper title
Time-rEversed diffusioN tEnsor Transformer: A new TENET of Few-Shot Object Detection
Paper authors
Abstract
In this paper, we tackle the challenging problem of Few-shot Object Detection (FSOD). Existing FSOD pipelines (i) use average-pooled representations that result in information loss; and/or (ii) discard position information that can help detect object instances. Consequently, such pipelines are sensitive to large intra-class appearance and geometric variations between support and query images. To address these drawbacks, we propose a Time-rEversed diffusioN tEnsor Transformer (TENET), which (i) forms high-order tensor representations that capture highly discriminative multi-way feature occurrences, and (ii) uses a transformer that dynamically extracts correlations between the query image and the entire support set, instead of a single average-pooled support embedding. We also propose a Transformer Relation Head (TRH), equipped with higher-order representations, which encodes correlations between query regions and the entire support set, while being sensitive to the positional variability of object instances. Our model achieves state-of-the-art results on the PASCAL VOC, FSOD, and COCO benchmarks.
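The abstract's claim that average pooling loses information can be illustrated with a toy example. The sketch below is not the TENET implementation. It assumes the simplest higher-order case, a second-order descriptor built from averaged outer products, and uses made-up two-channel features. The function names `average_pool` and `second_order_pool` are hypothetical.

```python
# Illustrative only: contrasts first-order (average) pooling with a
# second-order co-occurrence descriptor. Not the method from the paper.

def average_pool(features):
    """Mean of N feature vectors of dimension d, giving one d-vector."""
    n = len(features)
    d = len(features[0])
    return [sum(f[k] for f in features) / n for k in range(d)]


def second_order_pool(features):
    """Average outer product of N feature vectors, giving a d x d matrix."""
    n = len(features)
    d = len(features[0])
    return [
        [sum(f[i] * f[j] for f in features) / n for j in range(d)]
        for i in range(d)
    ]


# Two hypothetical sets of local features (2 locations, 2 channels each).
set_a = [[1.0, 0.0], [0.0, 1.0]]  # channels fire at different locations
set_b = [[0.5, 0.5], [0.5, 0.5]]  # channels fire together at every location

# The averages coincide, so a first-order descriptor cannot tell the sets apart.
print(average_pool(set_a) == average_pool(set_b))

# The co-occurrence matrices differ, so the second-order descriptor can.
print(second_order_pool(set_a) == second_order_pool(set_b))
```

Both sets average to the same vector, yet their co-occurrence matrices differ. That difference is the kind of multi-way feature statistic a first-order summary discards.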