Paper Title

SLAM-Supported Self-Training for 6D Object Pose Estimation

Paper Authors

Ziqi Lu, Yihao Zhang, Kevin Doherty, Odin Severinsen, Ethan Yang, John Leonard

Paper Abstract

Recent progress in object pose prediction provides a promising path for robots to build object-level scene representations during navigation. However, as we deploy a robot in novel environments, the out-of-distribution data can degrade the prediction performance. To mitigate the domain gap, we can potentially perform self-training in the target domain, using predictions on robot-captured images as pseudo labels to fine-tune the object pose estimator. Unfortunately, the pose predictions are typically outlier-corrupted, and it is hard to quantify their uncertainties, which can result in low-quality pseudo-labeled data. To address the problem, we propose a SLAM-supported self-training method, leveraging robot understanding of the 3D scene geometry to enhance the object pose inference performance. Combining the pose predictions with robot odometry, we formulate and solve pose graph optimization to refine the object pose estimates and make pseudo labels more consistent across frames. We incorporate the pose prediction covariances as variables into the optimization to automatically model their uncertainties. This automatic covariance tuning (ACT) process can fit 6D pose prediction noise at the component level, leading to higher-quality pseudo training data. We test our method with the deep object pose estimator (DOPE) on the YCB video dataset and in real robot experiments. It achieves 34.3% and 17.8% accuracy improvements in pose prediction on the two tests, respectively. Our code is available at https://github.com/520xyxyzq/slam-super-6d.
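For readers who want to prototype the idea, the sketch below (not the authors' released code; see their repository above for that) shows how one might set up the pose graph in GTSAM's Python bindings: odometry between-factors chain the camera poses, each per-frame DOPE-style prediction becomes a robust between-factor to a single object-pose variable, and a simple alternating loop refits the object factors' per-component sigmas from residuals. The factor layout, noise values, Huber kernel, and the residual-based refit are all illustrative assumptions; the paper's ACT treats the covariances as optimization variables, which this loop only approximates.

```python
# Minimal sketch (assumptions noted above, not the paper's exact ACT):
# fuse odometry with per-frame 6D object pose predictions via pose graph
# optimization, then refit the prediction noise component-wise.
import numpy as np
import gtsam

X = lambda i: gtsam.symbol('x', i)  # camera/robot poses
L = lambda j: gtsam.symbol('l', j)  # object pose (one static object)

def build_graph(odom, preds, obj_sigmas):
    """Odometry BetweenFactors + robust object-prediction BetweenFactors."""
    graph = gtsam.NonlinearFactorGraph()
    graph.add(gtsam.PriorFactorPose3(
        X(0), gtsam.Pose3(), gtsam.noiseModel.Diagonal.Sigmas(np.full(6, 1e-6))))
    odom_noise = gtsam.noiseModel.Diagonal.Sigmas(
        np.array([0.01, 0.01, 0.01, 0.02, 0.02, 0.02]))  # rot (rad), trans (m)
    for i, o in enumerate(odom):
        graph.add(gtsam.BetweenFactorPose3(X(i), X(i + 1), o, odom_noise))
    # Huber kernel to downweight outlier-corrupted predictions
    obj_noise = gtsam.noiseModel.Robust.Create(
        gtsam.noiseModel.mEstimator.Huber.Create(1.345),
        gtsam.noiseModel.Diagonal.Sigmas(obj_sigmas))
    for i, z in preds:  # (frame index, predicted camera_T_object)
        graph.add(gtsam.BetweenFactorPose3(X(i), L(0), z, obj_noise))
    return graph

def act_loop(odom, preds, initial, iters=5):
    """Alternate: (1) solve the PGO, (2) refit per-component sigmas."""
    sigmas = np.full(6, 0.1)
    for _ in range(iters):
        values = gtsam.LevenbergMarquardtOptimizer(
            build_graph(odom, preds, sigmas), initial).optimize()
        residuals = []
        for i, z in preds:
            est_rel = values.atPose3(X(i)).between(values.atPose3(L(0)))
            residuals.append(gtsam.Pose3.Logmap(z.between(est_rel)))
        # Component-wise RMS of residuals as the new sigmas (a crude stand-in
        # for the paper's covariance-as-variable formulation)
        sigmas = np.sqrt(np.mean(np.square(residuals), axis=0)) + 1e-4
        initial = values
    return values, sigmas

if __name__ == "__main__":
    # Toy data: robot steps 0.5 m along x; object fixed at (2, 0, 0).
    true_obj = gtsam.Pose3(gtsam.Rot3(), gtsam.Point3(2, 0, 0))
    odom = [gtsam.Pose3(gtsam.Rot3(), gtsam.Point3(0.5, 0, 0)) for _ in range(4)]
    initial, pose, preds = gtsam.Values(), gtsam.Pose3(), []
    initial.insert(X(0), pose)
    for i in range(5):
        if i > 0:
            pose = pose.compose(odom[i - 1])
            initial.insert(X(i), pose)
        preds.append((i, pose.between(true_obj)))  # noise-free toy predictions
    initial.insert(L(0), gtsam.Pose3())
    values, sigmas = act_loop(odom, preds, initial)
    print("Refined object pose:\n", values.atPose3(L(0)))
```

The robust kernel is one common way to keep outlier predictions from dominating the solve; the paper's component-level noise modeling is more principled than the diagonal RMS refit used in this toy loop.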
