Paper Title
Intersection Prediction from Single 360° Image via Deep Detection of Possible Direction of Travel
Paper Authors
Abstract
Movie-Map, an interactive first-person-view map that engages the user in a simulated walking experience, comprises short 360° video segments separated at traffic intersections and seamlessly connected according to the viewer's direction of travel. However, in wide urban-scale areas with numerous intersecting roads, manual intersection segmentation requires significant human effort. Automatic identification of intersections from 360° videos is therefore an important problem for scaling up Movie-Map. In this paper, we propose a novel method that identifies intersections from individual frames of 360° videos. Instead of formulating intersection identification as a standard binary classification task with a 360° image as input, we identify an intersection based on the number of possible directions of travel (PDoT) detected by a neural network in perspective images projected in eight directions from a single 360° image, which allows the method to handle various types of intersections. We constructed a large-scale 360° Image Intersection Identification (iii360) dataset for training and evaluation, in which 360° videos were collected from various areas such as a school campus, downtown, a suburb, and a Chinatown, and demonstrate that our PDoT-based method achieves 88% accuracy, significantly better than a direct naive binary classification based method. The source code and a partial dataset will be shared with the community after the paper is published.
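The PDoT-based decision rule described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the 45° spacing of the eight projected views, the helper names, and the threshold of three travelable directions are all assumptions; the per-view classifier in the paper is a trained neural network, stubbed out here as a callable.

```python
# Sketch of PDoT-based intersection identification (illustrative assumptions:
# eight views spaced 45 degrees apart; a frame counts as an intersection when
# at least 3 directions are travelable, since a plain road has at most 2).

from typing import Callable, List

def pdot_yaw_angles() -> List[float]:
    """Yaw angles (degrees) of the eight perspective views projected
    from a single 360-degree image."""
    return [i * 45.0 for i in range(8)]

def count_pdot(is_travelable: Callable[[float], bool]) -> int:
    """Count possible directions of travel (PDoT) by querying a
    per-view binary classifier at each of the eight yaw angles."""
    return sum(1 for yaw in pdot_yaw_angles() if is_travelable(yaw))

def is_intersection(is_travelable: Callable[[float], bool],
                    min_directions: int = 3) -> bool:
    """Label the frame an intersection when the PDoT count reaches
    the (assumed) threshold."""
    return count_pdot(is_travelable) >= min_directions

# Usage with a stub classifier: roads open toward 0, 90, and 180 degrees,
# i.e. a three-way junction.
open_yaws = {0.0, 90.0, 180.0}
print(is_intersection(lambda yaw: yaw in open_yaws))  # True
```

Counting travelable directions per view, rather than classifying the whole 360° image at once, is what lets the same rule cover three-way, four-way, and more complex intersections.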