Paper Title


RAFT-3D: Scene Flow using Rigid-Motion Embeddings

Paper Authors

Zachary Teed, Jia Deng

Paper Abstract


We address the problem of scene flow: given a pair of stereo or RGB-D video frames, estimate pixelwise 3D motion. We introduce RAFT-3D, a new deep architecture for scene flow. RAFT-3D is based on the RAFT model developed for optical flow but iteratively updates a dense field of pixelwise SE3 motion instead of 2D motion. A key innovation of RAFT-3D is rigid-motion embeddings, which represent a soft grouping of pixels into rigid objects. Integral to rigid-motion embeddings is Dense-SE3, a differentiable layer that enforces geometric consistency of the embeddings. Experiments show that RAFT-3D achieves state-of-the-art performance. On FlyingThings3D, under the two-view evaluation, we improve the best published accuracy (d < 0.05) from 34.3% to 83.7%. On KITTI, we achieve an error of 5.77, outperforming the best published method (6.31), despite using no object instance supervision. Code is available at https://github.com/princeton-vl/RAFT-3D.
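The abstract's core idea, a dense field of per-pixel SE3 transforms that induces pixelwise 3D motion, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation (see the linked repository for that); all array shapes and values here are toy assumptions, and the rotation field is simply the identity:

```python
import numpy as np

# Illustrative sketch (assumed toy setup, not RAFT-3D's code):
# scene flow expressed as the residual of applying a per-pixel
# SE3 transform (R, t) to each pixel's back-projected 3D point.

H, W = 4, 5  # toy image resolution

# Per-pixel rotation field (identity here) and translation field.
R = np.broadcast_to(np.eye(3), (H, W, 3, 3))  # (H, W, 3, 3)
t = np.zeros((H, W, 3))
t[..., 2] = 0.1  # every pixel translates 0.1 along the z-axis

# Back-projected 3D point per pixel (random toy values).
X = np.random.rand(H, W, 3)

# Apply the SE3 field pixelwise: X' = R X + t; scene flow = X' - X.
X_new = np.einsum('hwij,hwj->hwi', R, X) + t
flow3d = X_new - X  # (H, W, 3) pixelwise 3D motion
```

With identity rotations the induced flow equals the translation field; RAFT-3D's contribution is predicting such a field so that pixels on the same rigid object share one SE3 motion.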
