Paper Title

MobileDepth: Efficient Monocular Depth Prediction on Mobile Devices

Paper Authors

Wang, Yekai

Paper Abstract


Depth prediction is fundamental to many useful applications in computer vision and robotic systems. On mobile phones, the performance of useful applications such as augmented reality and autofocus could be enhanced by accurate depth prediction. In this work, an efficient fully convolutional network architecture for depth prediction is proposed, which uses RegNetY 06 as the encoder and split-concatenate shuffle blocks as the decoder. An appropriate combination of data augmentation, hyper-parameters, and loss functions for efficiently training the lightweight network is also provided. In addition, an Android application has been developed that can load CNN models to predict depth maps from monocular images captured by the mobile camera, and to evaluate the models' average latency and frames per second. As a result, the network achieves 82.7% δ1 accuracy on the NYU Depth v2 dataset while having a latency of only 62 ms on an ARM A76 CPU, so it can predict depth maps from the mobile camera in real time.
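
The δ1 accuracy quoted above is the standard threshold metric used in monocular depth evaluation: the fraction of pixels whose predicted depth is within a factor of 1.25 of the ground truth. A minimal sketch of how it is typically computed is given below; the function and variable names are chosen here for illustration and are not taken from the paper.

```python
import numpy as np

def delta1_accuracy(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-6) -> float:
    """Standard delta_1 metric: fraction of pixels with max(pred/gt, gt/pred) < 1.25.

    `pred` and `gt` are depth maps of the same shape (e.g. in metres);
    names are illustrative, not from the paper.
    """
    # Only evaluate pixels with valid (positive) ground-truth and predicted depth.
    valid = (gt > eps) & (pred > eps)
    ratio = np.maximum(pred[valid] / gt[valid], gt[valid] / pred[valid])
    return float(np.mean(ratio < 1.25))
```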
