OSSID: Online Self-Supervised Instance Detection by (and for) Pose Estimation

Paper Title

OSSID: Online Self-Supervised Instance Detection by (and for) Pose Estimation

Authors

Qiao Gu, Brian Okorn, David Held

Abstract

Real-time object pose estimation is necessary for many robot manipulation algorithms. However, state-of-the-art methods for object pose estimation are trained for a specific set of objects; these methods thus need to be retrained to estimate the pose of each new object, often requiring tens of GPU-days of training for optimal performance. In this paper, we propose the OSSID framework, leveraging a slow zero-shot pose estimator to self-supervise the training of a fast detection algorithm. This fast detector can then be used to filter the input to the pose estimator, drastically improving its inference speed. We show that this self-supervised training exceeds the performance of existing zero-shot detection methods on two widely used object pose estimation and detection datasets, without requiring any human annotations. Further, we show that the resulting method for pose estimation has a significantly faster inference speed, due to the ability to filter out large parts of the image. Thus, our method for self-supervised online learning of a detector (trained using pseudo-labels from a slow pose estimator) leads to accurate pose estimation at real-time speeds, without requiring human annotations. Supplementary materials and code can be found at https://georgegu1997.github.io/OSSID/
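The loop described in the abstract (a slow pose estimator produces pseudo-labels, a fast detector learns from them and then narrows the region the estimator has to search) can be illustrated with a toy example. This is a minimal sketch, not the authors' implementation: the classes `ToySlowPoseEstimator` and `ToyFastDetector`, the dict-based "image", the `conf_threshold` value, and the union-of-boxes "model" are all hypothetical stand-ins chosen only to make the control flow concrete.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels


def area(box: Box) -> int:
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])


def contains(outer: Box, inner: Box) -> bool:
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


@dataclass
class PoseResult:
    box: Box              # 2D box implied by the estimated pose
    confidence: float     # estimator's own score in [0, 1]
    pixels_searched: int  # toy proxy for inference cost


class ToySlowPoseEstimator:
    """Hypothetical stand-in for a slow zero-shot pose estimator."""

    def estimate(self, image: dict, region: Optional[Box] = None) -> PoseResult:
        width, height = image["size"]
        # Without a detector proposal, the whole image must be searched.
        search = region if region is not None else (0, 0, width, height)
        found = contains(search, image["object_box"])
        return PoseResult(
            box=image["object_box"] if found else search,
            confidence=0.9 if found else 0.1,
            pixels_searched=area(search),
        )


@dataclass
class ToyFastDetector:
    """Hypothetical stand-in for a fast detector trained on pseudo-labels."""
    margin: int = 10
    min_labels: int = 2
    labels: List[Box] = field(default_factory=list)

    def update(self, box: Box) -> None:
        # "Training" here is just storing the pseudo-label.
        self.labels.append(box)

    def predict(self, image: dict) -> Optional[Box]:
        if len(self.labels) < self.min_labels:
            return None  # not enough pseudo-labels yet
        width, height = image["size"]
        # Toy "model": union of all pseudo-labels, padded and clipped.
        x0 = max(0, min(b[0] for b in self.labels) - self.margin)
        y0 = max(0, min(b[1] for b in self.labels) - self.margin)
        x1 = min(width, max(b[2] for b in self.labels) + self.margin)
        y1 = min(height, max(b[3] for b in self.labels) + self.margin)
        return (x0, y0, x1, y1)


def online_step(image: dict, detector: ToyFastDetector,
                estimator: ToySlowPoseEstimator,
                conf_threshold: float = 0.5) -> PoseResult:
    """One online iteration: detect, estimate pose, self-supervise."""
    region = detector.predict(image)            # fast filter (may be None)
    result = estimator.estimate(image, region)  # slow step on filtered input
    if result.confidence >= conf_threshold:
        detector.update(result.box)             # pseudo-label, no human annotation
    return result
```

Calling `online_step` on a stream of frames with a slowly moving object shows the intended effect in this toy setting: the first frames are searched in full, and once the detector has collected enough pseudo-labels the estimator only searches a small crop.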
