Paper Title
Attention-Aware Cost Volume Pyramid Based Multi-view Stereo Network for 3D Reconstruction
Paper Authors
Paper Abstract
We present an efficient multi-view stereo (MVS) network for 3D reconstruction from multi-view images. While previous learning-based reconstruction approaches perform quite well, most of them estimate depth maps at a fixed resolution using plane sweep volumes with a fixed depth hypothesis at each plane, which requires densely sampled planes for the desired accuracy and therefore makes it difficult to achieve high-resolution depth maps. In this paper we introduce a coarse-to-fine depth inference strategy to achieve high-resolution depth. This strategy estimates the depth map at the coarsest level, while the depth maps at finer levels are obtained as the upsampled depth map from the previous level plus a pixel-wise depth residual. Thus, we narrow the depth search range with prior information from the previous level and construct new cost volumes from the pixel-wise depth residual to perform depth map refinement. The final depth map can then be obtained iteratively, since all the parameters are shared between the different levels. At each level, a self-attention layer is introduced into the feature extraction block to capture the long-range dependencies needed for the depth inference task, and the cost volume is generated using a similarity measurement instead of the variance-based methods used in previous work. Experiments were conducted on both the DTU benchmark dataset and the recently released BlendedMVS dataset. The results demonstrate that our model outperforms most state-of-the-art (SOTA) methods. The codebase of this project is at https://github.com/ArthasMil/AACVP-MVSNet.
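To make the two ideas in the abstract concrete, below is a minimal PyTorch sketch of (a) building per-pixel depth hypotheses as residuals around the upsampled coarse depth and (b) aggregating a cost volume by feature similarity (a simple inner product averaged over source views) rather than by the variance of warped feature volumes. The function names (residual_depth_hypotheses, similarity_cost_volume) and parameters (num_hypotheses, interval) are illustrative assumptions, not taken from the released AACVP-MVSNet codebase, and the similarity here is a simplified dot product rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def residual_depth_hypotheses(prev_depth, num_hypotheses=8, interval=1.0):
    """Build per-pixel depth hypotheses around an upsampled coarse depth map.

    prev_depth:     (B, H/2, W/2) depth estimated at the previous (coarser) level.
    num_hypotheses: number of residual depth planes per pixel (assumed value).
    interval:       spacing of the residual planes in depth units (assumed value).
    Returns a (B, D, H, W) tensor of hypothesised depths, D = num_hypotheses.
    """
    # Upsample the coarse depth to the current pyramid level's resolution.
    up = F.interpolate(prev_depth.unsqueeze(1), scale_factor=2,
                       mode="bilinear", align_corners=False)           # (B, 1, H, W)
    # Symmetric residual offsets centred on the upsampled depth.
    offsets = (torch.arange(num_hypotheses, dtype=up.dtype, device=up.device)
               - (num_hypotheses - 1) / 2) * interval                  # (D,)
    return up + offsets.view(1, -1, 1, 1)                              # (B, D, H, W)


def similarity_cost_volume(ref_feat, warped_src_feats):
    """Aggregate a cost volume by feature similarity instead of feature variance.

    ref_feat:         (B, C, H, W) reference-view features.
    warped_src_feats: list of (B, C, D, H, W) source-view features warped to the
                      reference view at each depth hypothesis.
    Returns a (B, D, H, W) similarity volume averaged over the source views.
    """
    ref = ref_feat.unsqueeze(2)                                        # (B, C, 1, H, W)
    sims = [(ref * src).mean(dim=1) for src in warped_src_feats]       # each (B, D, H, W)
    return torch.stack(sims, dim=0).mean(dim=0)
```

In this sketch the same two functions would be reused at every pyramid level, which is consistent with the abstract's statement that all parameters are shared across levels: only the upsampled depth and the (progressively narrower) residual search range change from level to level.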