Paper Title
Dynamic Sparse R-CNN
Paper Authors
Paper Abstract
Sparse R-CNN is a recent strong object detection baseline that performs set prediction on sparse, learnable proposal boxes and proposal features. In this work, we propose to improve Sparse R-CNN with two dynamic designs. First, Sparse R-CNN adopts a one-to-one label assignment scheme, where the Hungarian algorithm matches only one positive sample to each ground truth. Such one-to-one assignment may not be optimal for matching the learned proposal boxes to the ground truths. To address this problem, we propose dynamic label assignment (DLA) based on the optimal transport algorithm, which assigns an increasing number of positive samples across the iterative training stages of Sparse R-CNN. We constrain the matching to become gradually looser in the sequential stages, as later stages produce refined proposals with improved precision. Second, the learned proposal boxes and features remain fixed across different images during the inference process of Sparse R-CNN. Motivated by dynamic convolution, we propose dynamic proposal generation (DPG), which dynamically assembles multiple proposal experts to provide better initial proposal boxes and features for the consecutive training stages. DPG can thereby derive sample-dependent proposal boxes and features at inference. Experiments demonstrate that our method, named Dynamic Sparse R-CNN, can boost the strong Sparse R-CNN baseline with different backbones for object detection. In particular, Dynamic Sparse R-CNN reaches a state-of-the-art 47.2% AP on the COCO 2017 validation set, surpassing Sparse R-CNN by 2.2% AP with the same ResNet-50 backbone.
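The two dynamic designs described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the simple per-ground-truth top-k rule (standing in for the full optimal transport solver), and the softmax mixture over expert proposal sets are all illustrative assumptions, kept only to show the stage-wise loosening of label assignment and the sample-dependent combination of proposal experts.

```python
# Illustrative sketch only; names and the top-k stand-in for optimal
# transport are assumptions, not the paper's actual implementation.
import numpy as np

def dynamic_label_assignment(cost, stage, base_k=1, step=1):
    """Stage-wise dynamic label assignment (DLA), simplified.
    cost: (num_proposals, num_gts) matching-cost matrix; lower is better.
    k grows with the stage index, so later (more refined) stages assign
    more positive proposals per ground truth. Returns a boolean mask."""
    k = min(base_k + step * stage, cost.shape[0])
    mask = np.zeros(cost.shape, dtype=bool)
    for g in range(cost.shape[1]):
        topk = np.argsort(cost[:, g])[:k]  # k cheapest proposals for GT g
        mask[topk, g] = True
    return mask

def dynamic_proposal_generation(image_feat, expert_boxes, W):
    """Dynamic proposal generation (DPG), simplified in the spirit of
    dynamic convolution: combine E expert proposal sets with
    image-dependent softmax weights, yielding sample-dependent boxes.
    image_feat: (d,) pooled image feature; expert_boxes: (E, N, 4);
    W: (E, d) weight-prediction matrix (a hypothetical parameter)."""
    logits = W @ image_feat
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return np.tensordot(w, expert_boxes, axes=1)  # (N, 4)
```

With `stage=0` and the defaults, `dynamic_label_assignment` reduces to a one-to-one assignment (one positive per ground truth); at later stages the matching loosens exactly as the abstract describes.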