Compositionally Generalizable 3D Structure Prediction

Paper Title

Compositionally Generalizable 3D Structure Prediction

Authors

Songfang Han, Jiayuan Gu, Kaichun Mo, Li Yi, Siyu Hu, Xuejin Chen, Hao Su

Abstract

Single-image 3D shape reconstruction is an important and long-standing problem in computer vision. A plethora of existing works are constantly pushing state-of-the-art performance in the deep learning era. However, a much more difficult and under-explored issue remains: how to generalize the learned skills to unseen object categories that have very different shape geometry distributions. In this paper, we introduce the concept of compositional generalizability and propose a novel framework that generalizes better to these unseen categories. We factorize the 3D shape reconstruction problem into proper sub-problems, each of which is tackled by a neural sub-module carefully designed with generalizability in mind. The intuition behind our formulation is that object parts (slates and cylindrical parts), their relationships (adjacency and translation symmetry), and shape substructures (T-junctions and symmetric groups of parts) are mostly shared across object categories, even though object geometries may look very different (e.g., chairs and cabinets). Experiments on PartNet show that we achieve performance superior to the state of the art. This validates our problem factorization and network designs.
