Paper Title

3D-FlowNet: Event-based optical flow estimation with 3D representation

Paper Authors

Haixin Sun, Minh-Quan Dao, Vincent Fremont

Paper Abstract

Event-based cameras can overcome the limitations of frame-based cameras for important tasks such as high-speed motion detection during self-driving car navigation in low-illumination conditions. The high temporal resolution and high dynamic range of event cameras allow them to work in fast-motion and extreme-light scenarios. However, conventional computer vision methods, such as Deep Neural Networks, are not well adapted to event data, which is asynchronous and discrete. Moreover, traditional 2D-encoding representation methods for event data sacrifice temporal resolution. In this paper, we first improve the 2D-encoding representation by expanding it into three dimensions to better preserve the temporal distribution of the events. We then propose 3D-FlowNet, a novel network architecture that can process the 3D input representation and output optical flow estimations according to the new encoding method. A self-supervised training strategy is adopted to compensate for the lack of labeled datasets for event-based cameras. Finally, the proposed network is trained and evaluated on the Multi-Vehicle Stereo Event Camera (MVSEC) dataset. The results show that our 3D-FlowNet outperforms state-of-the-art approaches with fewer training epochs (30 compared to 100 for Spike-FlowNet).
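The abstract does not spell out the 3D encoding, but the stated idea of extending a 2D event frame along a third, temporal dimension can be sketched as a temporal voxel grid. The snippet below is a minimal, illustrative Python/NumPy sketch under that assumption: the function name events_to_voxel_grid, the bin count, the signed-polarity accumulation, and the 346x260 DAVIS-like resolution (as used in MVSEC) are assumptions for illustration, not the authors' exact method.

```python
# Minimal sketch: discretize an event stream (x, y, t, polarity) into a
# (num_bins, H, W) voxel grid so the temporal distribution of events
# within the window is kept instead of being collapsed into one 2D frame.
# NOTE: bin count, polarity handling, and normalization are illustrative
# assumptions, not the paper's exact 3D encoding.
import numpy as np

def events_to_voxel_grid(xs, ys, ts, ps, num_bins, height, width):
    """Accumulate events into a (num_bins, height, width) float grid."""
    grid = np.zeros((num_bins, height, width), dtype=np.float32)
    # Map timestamps of the window onto bin indices 0 .. num_bins - 1.
    t0, t1 = ts[0], ts[-1]
    t_norm = (ts - t0) / max(t1 - t0, 1e-9) * (num_bins - 1)
    bins = t_norm.astype(np.int64)
    # Signed polarity (+1 / -1) keeps ON and OFF events distinguishable.
    pol = np.where(ps > 0, 1.0, -1.0).astype(np.float32)
    # Scatter-add each event into its (time-bin, y, x) cell.
    np.add.at(grid, (bins, ys, xs), pol)
    return grid

# Usage: 5 temporal bins over a 346x260 sensor, with synthetic events.
rng = np.random.default_rng(0)
n = 10_000
xs = rng.integers(0, 346, n)
ys = rng.integers(0, 260, n)
ts = np.sort(rng.random(n))          # monotonically increasing timestamps
ps = rng.integers(0, 2, n)           # 0 = OFF, 1 = ON
voxels = events_to_voxel_grid(xs, ys, ts, ps, num_bins=5, height=260, width=346)
print(voxels.shape)  # (5, 260, 346) -- a 3D input tensor for the network
```

Such a grid can then be fed to a 3D-capable network as a multi-channel (or volumetric) input; the key design point the abstract emphasizes is that the third axis preserves when events occurred, not just where.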
