Paper Title
PrimA6D: Rotational Primitive Reconstruction for Enhanced and Robust 6D Pose Estimation

Paper Authors

Myung-Hwan Jeon and Ayoung Kim

Abstract
In this paper, we introduce rotational-primitive-prediction-based 6D object pose estimation using a single image as input. We solve for the 6D pose of a known object relative to the camera from a single image with occlusion. Many recent state-of-the-art (SOTA) two-step approaches have exploited image keypoint extraction followed by PnP regression for pose estimation. Instead of relying on a bounding box or keypoints on the object, we propose to learn an orientation-induced primitive so as to achieve pose estimation accuracy regardless of the object size. We leverage a Variational AutoEncoder (VAE) to learn this underlying primitive and its associated keypoints. The keypoints inferred from the reconstructed primitive image are then used to regress the rotation using PnP. Lastly, we compute the translation in a separate localization module to complete the entire 6D pose estimation. When evaluated over public datasets, the proposed method yields a notable improvement on the LINEMOD, Occlusion LINEMOD, and YCB-Video datasets. We further provide a synthetic-only trained case presenting performance comparable to existing methods that require real images in the training phase.
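The abstract's final step, recovering the translation in a separate localization module, amounts to back-projecting a predicted 2D object center at an estimated depth through the camera intrinsics. A minimal sketch of that geometry, assuming a simple pinhole camera model with hypothetical intrinsic values (`fx`, `fy`, `cx`, `cy` are illustrative, not the paper's):

```python
# Hedged sketch of the translation-recovery step: back-project a predicted
# 2D object center (u, v) at depth z into the camera frame. The intrinsics
# below are example values, not taken from the paper.

def backproject_center(u, v, z, fx, fy, cx, cy):
    """Back-project pixel (u, v) at depth z into 3D camera coordinates."""
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)

# A point at the principal point maps onto the optical axis.
t = backproject_center(325.3, 242.0, 1.0, 572.4, 573.6, 325.3, 242.0)
print(t)  # (0.0, 0.0, 1.0)
```

Combined with the rotation regressed via PnP from the primitive keypoints, this translation completes the full 6D pose.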