Exploiting Correspondences with All-pairs Correlations for Multi-view Depth Estimation

Paper title

Exploiting Correspondences with All-pairs Correlations for Multi-view Depth Estimation

Authors

Kai Cheng, Hao Chen, Wei Yin, Guangkai Xu, Xuejin Chen

Abstract

Multi-view depth estimation plays a critical role in reconstructing and understanding the 3D world, and recent learning-based methods have made significant progress on this task. Multi-view depth estimation is fundamentally a correspondence-based optimization problem, yet previous learning-based methods mainly rely on predefined depth hypotheses to build correspondences as a cost volume and implicitly regularize that volume to fit the depth prediction, which deviates from the essence of iterative optimization based on stereo correspondence. As a result, their precision and generalization capability are unsatisfactory. In this paper, we are the first to explore more general image correlations to establish correspondences dynamically for depth estimation. We design a novel iterative multi-view depth estimation framework that mimics the optimization process. It consists of 1) a correlation volume construction module that models the pixel similarity between a reference image and the source images as all-to-all correlations; 2) a flow-based depth initialization module that estimates depth from 2D optical flow; and 3) a novel correlation-guided depth refinement module that reprojects points into different views to fetch the relevant correlations, fuses them, and integrates the fused correlations for iterative depth updates. Without predefined depth hypotheses, the fused correlations establish multi-view correspondences efficiently and guide the depth refinement heuristically. We conduct extensive experiments on ScanNet, DeMoN, ETH3D, and 7Scenes to demonstrate the superiority of our method for multi-view depth estimation and its superior generalization ability.
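
The abstract's first module models pixel similarity between a reference image and a source image as all-to-all correlations. As a rough illustration only, the sketch below builds such a volume as scaled dot products between per-pixel feature vectors. The abstract does not specify the similarity measure, the feature extractor, or the scaling, so the dot product, the 1/sqrt(C) factor, the function name `all_pairs_correlation`, and the random feature maps are all assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def all_pairs_correlation(ref_feat, src_feat):
    """Correlate every reference pixel with every source pixel.

    ref_feat: (C, H, W) reference feature map (hypothetical input).
    src_feat: (C, Hs, Ws) source feature map (hypothetical input).
    Returns an array of shape (H, W, Hs, Ws) where entry [i, j, k, l]
    is the scaled dot product of the feature at reference pixel (i, j)
    and the feature at source pixel (k, l).
    """
    c, h, w = ref_feat.shape
    c_src, hs, ws = src_feat.shape
    if c != c_src:
        raise ValueError("feature maps must share the channel dimension")
    ref = ref_feat.reshape(c, h * w)      # (C, H*W)
    src = src_feat.reshape(c, hs * ws)    # (C, Hs*Ws)
    # Assumed similarity: dot product scaled by sqrt(C).
    corr = ref.T @ src / np.sqrt(c)       # (H*W, Hs*Ws)
    return corr.reshape(h, w, hs, ws)


# Random stand-ins for learned feature maps.
rng = np.random.default_rng(0)
ref_feat = rng.standard_normal((8, 4, 5))
src_feat = rng.standard_normal((8, 4, 5))
corr = all_pairs_correlation(ref_feat, src_feat)
```

Because the volume stores a score for every pixel pair, a later step can look up the correlation at whatever source location a reference pixel reprojects to, without committing to a fixed set of depth hypotheses in advance. Memory grows with the product of the two image sizes, so a volume like this would normally be built on downsampled feature maps.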
