Paper Title
Monocular Depth Estimation Using Multi-Scale Neural Network and Feature Fusion
Paper Authors
Paper Abstract
Depth estimation from monocular images is a challenging problem in computer vision. In this paper, we tackle this problem with a novel network architecture based on multi-scale feature fusion. Our network uses two different blocks: the first performs convolutions with different filter sizes and merges all the individual feature maps, while the second uses dilated convolutions in place of fully connected layers, reducing computation and increasing the receptive field. We present a new loss function for training the network that combines a depth regression term, an SSIM loss term, and a multinomial logistic loss term. We train and test our network on the Make3D, NYU Depth V2, and KITTI datasets using standard evaluation metrics for depth estimation, including RMSE and SILog. Our network outperforms previous state-of-the-art methods with fewer parameters.
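The abstract describes a two-block design: a multi-scale block that convolves the input with several filter sizes and fuses the resulting feature maps, and a dilated-convolution block that replaces fully connected layers to enlarge the receptive field cheaply. The PyTorch sketch below illustrates that general idea only; the channel counts, kernel sizes (1/3/5/7), and dilation rates (2/4/8) are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn


class MultiScaleFusionBlock(nn.Module):
    """Convolve the input with several filter sizes in parallel and fuse the
    individual feature maps by channel-wise concatenation."""

    def __init__(self, in_channels, out_channels, kernel_sizes=(1, 3, 5, 7)):
        super().__init__()
        branch_channels = out_channels // len(kernel_sizes)
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(in_channels, branch_channels, k, padding=k // 2),
                nn.BatchNorm2d(branch_channels),
                nn.ReLU(inplace=True),
            )
            for k in kernel_sizes
        )

    def forward(self, x):
        # Every branch keeps the spatial size; fusion is a concat over channels.
        return torch.cat([branch(x) for branch in self.branches], dim=1)


class DilatedContextBlock(nn.Module):
    """Stacked dilated convolutions used in place of fully connected layers:
    the receptive field grows with the dilation rate while the parameter
    count stays that of ordinary 3x3 convolutions."""

    def __init__(self, channels, dilations=(2, 4, 8)):
        super().__init__()
        layers = []
        for d in dilations:
            layers += [
                nn.Conv2d(channels, channels, 3, padding=d, dilation=d),
                nn.ReLU(inplace=True),
            ]
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return self.body(x)


# Example: fuse multi-scale features, then expand context with dilations.
features = MultiScaleFusionBlock(3, 64)(torch.randn(1, 3, 240, 320))
context = DilatedContextBlock(64)(features)  # shape stays (1, 64, 240, 320)
```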
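The training objective combines a depth regression term, an SSIM term, and a multinomial logistic term. The abstract does not give the exact forms or weights, so the sketch below is only an interpretation: it assumes an L1 regression term, a windowed SSIM term, and a cross-entropy over uniformly discretized depth bins for the multinomial logistic term; `n_bins`, `max_depth`, and the weights `w_*` are hypothetical parameters.

```python
import torch
import torch.nn.functional as F


def ssim(pred, target, C1=0.01 ** 2, C2=0.03 ** 2):
    """Simplified SSIM using 3x3 average pooling as the local mean/variance."""
    mu_p = F.avg_pool2d(pred, 3, 1, padding=1)
    mu_t = F.avg_pool2d(target, 3, 1, padding=1)
    var_p = F.avg_pool2d(pred * pred, 3, 1, padding=1) - mu_p ** 2
    var_t = F.avg_pool2d(target * target, 3, 1, padding=1) - mu_t ** 2
    cov = F.avg_pool2d(pred * target, 3, 1, padding=1) - mu_p * mu_t
    num = (2 * mu_p * mu_t + C1) * (2 * cov + C2)
    den = (mu_p ** 2 + mu_t ** 2 + C1) * (var_p + var_t + C2)
    return (num / den).clamp(0, 1)


def combined_depth_loss(pred_depth, gt_depth, bin_logits=None, n_bins=80,
                        max_depth=10.0, w_reg=1.0, w_ssim=1.0, w_cls=0.1):
    """Weighted sum of a depth regression term, an SSIM term, and an optional
    multinomial logistic (cross-entropy over depth bins) term."""
    # 1. Pointwise depth regression (L1 here; the paper's exact form may differ).
    reg = F.l1_loss(pred_depth, gt_depth)
    # 2. Structural similarity, turned into a loss in [0, 1].
    ssim_term = (1.0 - ssim(pred_depth, gt_depth)).mean() / 2.0
    loss = w_reg * reg + w_ssim * ssim_term
    # 3. Multinomial logistic loss over discretized depth, assuming the network
    #    also predicts per-pixel bin logits of shape (N, n_bins, H, W).
    if bin_logits is not None:
        gt_bins = (gt_depth / max_depth * (n_bins - 1)).long().clamp(0, n_bins - 1)
        loss = loss + w_cls * F.cross_entropy(bin_logits, gt_bins.squeeze(1))
    return loss
```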