Paper Title


RAFT-3D: Scene Flow using Rigid-Motion Embeddings

Paper Authors

Zachary Teed, Jia Deng

Paper Abstract


We address the problem of scene flow: given a pair of stereo or RGB-D video frames, estimate pixelwise 3D motion. We introduce RAFT-3D, a new deep architecture for scene flow. RAFT-3D is based on the RAFT model developed for optical flow but iteratively updates a dense field of pixelwise SE3 motion instead of 2D motion. A key innovation of RAFT-3D is rigid-motion embeddings, which represent a soft grouping of pixels into rigid objects. Integral to rigid-motion embeddings is Dense-SE3, a differentiable layer that enforces geometric consistency of the embeddings. Experiments show that RAFT-3D achieves state-of-the-art performance. On FlyingThings3D, under the two-view evaluation, we improve the best published accuracy (d < 0.05) from 34.3% to 83.7%. On KITTI, we achieve an error of 5.77, outperforming the best published method (6.31), despite using no object instance supervision. Code is available at https://github.com/princeton-vl/RAFT-3D.
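The abstract's core idea, a dense field of per-pixel SE3 transforms that induces pixelwise 3D motion, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation (see the linked repository for that); all array shapes and values here are toy assumptions, and the rotation field is simply the identity:

```python
import numpy as np

# Illustrative sketch (assumed toy setup, not RAFT-3D's code):
# scene flow expressed as the residual of applying a per-pixel
# SE3 transform (R, t) to each pixel's back-projected 3D point.

H, W = 4, 5  # toy image resolution

# Per-pixel rotation field (identity here) and translation field.
R = np.broadcast_to(np.eye(3), (H, W, 3, 3))  # (H, W, 3, 3)
t = np.zeros((H, W, 3))
t[..., 2] = 0.1  # every pixel translates 0.1 along the z-axis

# Back-projected 3D point per pixel (random toy values).
X = np.random.rand(H, W, 3)

# Apply the SE3 field pixelwise: X' = R X + t; scene flow = X' - X.
X_new = np.einsum('hwij,hwj->hwi', R, X) + t
flow3d = X_new - X  # (H, W, 3) pixelwise 3D motion
```

With identity rotations the induced flow equals the translation field; RAFT-3D's contribution is predicting such a field so that pixels on the same rigid object share one SE3 motion.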
