Paper Title


Introducing Pose Consistency and Warp-Alignment for Self-Supervised 6D Object Pose Estimation in Color Images

Authors

Juil Sock, Guillermo Garcia-Hernando, Anil Armagan, Tae-Kyun Kim

Abstract


Most successful approaches to estimate the 6D pose of an object typically train a neural network by supervising the learning with annotated poses in real world images. These annotations are generally expensive to obtain and a common workaround is to generate and train on synthetic scenes, with the drawback of limited generalisation when the model is deployed in the real world. In this work, a two-stage 6D object pose estimator framework that can be applied on top of existing neural-network-based approaches and that does not require pose annotations on real images is proposed. The first self-supervised stage enforces the pose consistency between rendered predictions and real input images, narrowing the gap between the two domains. The second stage fine-tunes the previously trained model by enforcing the photometric consistency between pairs of different object views, where one image is warped and aligned to match the view of the other and thus enabling their comparison. In the absence of both real image annotations and depth information, applying the proposed framework on top of two recent approaches results in state-of-the-art performance when compared to methods trained only on synthetic data, domain adaptation baselines and a concurrent self-supervised approach on LINEMOD, LINEMOD OCCLUSION and HomebrewedDB datasets.
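The second stage's photometric consistency can be illustrated with a minimal sketch: one view is warped into the viewpoint of another, and the per-pixel colour difference between the warped image and the target view is penalised. The helper names, the dense-displacement warp, and the nearest-neighbour sampling below are all illustrative assumptions, not the paper's actual (differentiable) implementation.

```python
import numpy as np

def warp_image(img, flow):
    """Warp `img` by a dense pixel-displacement field `flow` of shape
    (H, W, 2) using nearest-neighbour sampling.  Hypothetical helper:
    the paper would use a differentiable warp derived from the
    predicted poses, not a precomputed flow field."""
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(np.round(ys + flow[..., 0]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs + flow[..., 1]).astype(int), 0, w - 1)
    return img[src_y, src_x]

def photometric_loss(img_a, img_b, flow_ab):
    """Mean absolute photometric error between view B and view A
    warped into B's viewpoint, enabling their direct comparison."""
    warped_a = warp_image(img_a, flow_ab)
    return float(np.abs(warped_a - img_b).mean())
```

With a zero displacement field and identical views the loss is zero; as the predicted poses (and hence the warp) improve, the warped view aligns with the target and the loss decreases, providing a self-supervised training signal that needs no pose annotations.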
