Robust Consistent Video Depth Estimation

Paper title

Robust Consistent Video Depth Estimation

Authors

Johannes Kopf, Xuejian Rong, Jia-Bin Huang

Abstract

We present an algorithm for estimating consistent dense depth maps and camera poses from a monocular video. We integrate a learning-based depth prior, in the form of a convolutional neural network trained for single-image depth estimation, with geometric optimization, to estimate a smooth camera trajectory as well as a detailed and stable depth reconstruction. Our algorithm combines two complementary techniques: (1) flexible deformation splines for low-frequency large-scale alignment and (2) geometry-aware depth filtering for high-frequency alignment of fine depth details. In contrast to prior approaches, our method does not require camera poses as input and achieves robust reconstruction for challenging hand-held cell phone captures containing a significant amount of noise, shake, motion blur, and rolling shutter deformations. Our method quantitatively outperforms state-of-the-art methods on the Sintel benchmark for both depth and pose estimation and attains favorable qualitative results across diverse in-the-wild datasets.
