Image to Icosahedral Projection for $\mathrm{SO}(3)$ Object Reasoning from Single-View Images

Paper Title

Image to Icosahedral Projection for $\mathrm{SO}(3)$ Object Reasoning from Single-View Images

Authors

David Klee, Ondrej Biza, Robert Platt, Robin Walters

Abstract

Reasoning about 3D objects based on 2D images is challenging due to variations in appearance caused by viewing the object from different orientations. Some tasks, such as object classification, are invariant to 3D rotations, while others, such as pose estimation, are equivariant. However, imposing equivariance as a model constraint is typically not possible with 2D image input because we do not have an a priori model of how the image changes under out-of-plane object rotations. The only $\mathrm{SO}(3)$-equivariant models that currently exist require point cloud or voxel input rather than 2D images. In this paper, we propose a novel architecture based on icosahedral group convolutions that reasons in $\mathrm{SO}(3)$ by learning a projection of the input image onto an icosahedron. The resulting model is approximately equivariant to rotation in $\mathrm{SO}(3)$. We apply this model to object pose estimation and shape classification tasks and find that it outperforms reasonable baselines. Project website: https://dmklee.github.io/image2icosahedral
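The distinction between invariance and equivariance drawn in the abstract can be illustrated with a toy example. The sketch below is not the paper's architecture; it uses a hypothetical point set and two hand-picked features (`centroid` and `mean_radius`, both names chosen here for illustration) to show that an equivariant feature rotates along with the object while an invariant feature stays unchanged. The rotation used is a five-fold rotation about the z-axis, the kind of symmetry an icosahedron has about an axis through a vertex.

```python
import math

def rotation_z(theta):
    """3x3 rotation matrix about the z-axis by angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply(R, p):
    """Rotate a single 3D point p by matrix R."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) for i in range(3))

def centroid(points):
    """Equivariant feature: rotating the input rotates the output the same way."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def mean_radius(points):
    """Invariant feature: unchanged by any rotation about the origin."""
    return sum(math.sqrt(sum(x * x for x in p)) for p in points) / len(points)

def close(a, b, tol=1e-9):
    """Component-wise comparison of two 3D tuples within a tolerance."""
    return all(abs(x - y) < tol for x, y in zip(a, b))

# Hypothetical toy "object": four points in 3D (illustrative data only).
points = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]
R = rotation_z(2 * math.pi / 5)  # rotation by 72 degrees
rotated = [apply(R, p) for p in points]

# Equivariance: f(R x) == R f(x).  Invariance: f(R x) == f(x).
equivariant_ok = close(centroid(rotated), apply(R, centroid(points)))
invariant_ok = abs(mean_radius(rotated) - mean_radius(points)) < 1e-9
print(equivariant_ok, invariant_ok)
```

In this picture, pose estimation behaves like `centroid` (the answer must rotate with the object) and classification behaves like `mean_radius` (the answer must not change). The difficulty the abstract points to is that a 2D image does not transform in such a simple, known way under out-of-plane rotations.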
