Paper title
Doubly Deformable Aggregation of Covariance Matrices for Few-shot Segmentation
Authors
Abstract
Training semantic segmentation models with few annotated samples has great potential in various real-world applications. For the few-shot segmentation task, the main challenge is how to accurately measure the semantic correspondence between the support and query samples with limited training data. To address this problem, we propose to aggregate learnable covariance matrices with a deformable 4D Transformer to effectively predict the segmentation map. Specifically, we first devise a novel hard example mining mechanism to learn covariance kernels for the Gaussian process. The learned covariance kernel functions have great advantages over existing cosine similarity-based methods in correspondence measurement. Based on the learned covariance kernels, an efficient doubly deformable 4D Transformer module is designed to adaptively aggregate feature similarity maps into segmentation results. By combining these two designs, the proposed method not only sets new state-of-the-art performance on public benchmarks, but also converges much faster than existing methods. Experiments on three public datasets demonstrate the effectiveness of our method.
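To make the contrast between cosine similarity and a covariance kernel concrete, the following is a minimal sketch and not the authors' implementation. It builds a 4D correspondence tensor between every query position and every support position, once with cosine similarity and once with a fixed RBF (squared-exponential) kernel, a common Gaussian process covariance function. The abstract describes a learned kernel; the fixed RBF kernel, the `length_scale` parameter, the function names, and the toy feature maps are all assumptions made for illustration.

```python
import math

def cosine(u, v, eps=1e-8):
    # Cosine similarity: depends only on the angle between u and v.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)

def rbf(u, v, length_scale=1.0):
    # RBF covariance kernel (hypothetical stand-in for a learned kernel):
    # depends on the Euclidean distance between u and v.
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))

def correlation_4d(query, support, kernel):
    # query[i][j] and support[k][l] are feature vectors.
    # Returns corr with corr[i][j][k][l] = kernel(query[i][j], support[k][l]).
    return [[[[kernel(q, s) for s in s_row]
              for s_row in support]
             for q in q_row]
            for q_row in query]

# Toy feature maps with 2 channels: query is 1x2, support is 2x1.
query = [[[1.0, 0.0], [0.0, 1.0]]]
support = [[[2.0, 0.0]], [[0.0, 1.0]]]

cos_corr = correlation_4d(query, support, cosine)
rbf_corr = correlation_4d(query, support, rbf)

# [1, 0] vs [2, 0]: same direction, different magnitude.
print(cos_corr[0][0][0][0])  # close to 1: cosine ignores the magnitude gap
print(rbf_corr[0][0][0][0])  # exp(-0.5): the RBF kernel penalizes the distance
```

In this toy case the two measures disagree on the pair `[1, 0]` and `[2, 0]`: cosine similarity treats them as nearly identical, while the distance-based kernel does not. A 4D tensor of this shape is the kind of input that an aggregation module such as the 4D Transformer mentioned in the abstract would consume.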