3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D Point Clouds

Paper title

3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D Point Clouds

Authors

Jyoti Kini, Ajmal Mian, Mubarak Shah

Abstract

We propose a method for joint detection and tracking of multiple objects in 3D point clouds, a task conventionally treated as a two-step process comprising object detection followed by data association. Our method embeds both steps into a single end-to-end trainable network, eliminating the dependency on external object detectors. Our model exploits temporal information by employing multiple frames to detect objects and track them in a single network, thereby making it a practical formulation for real-world scenarios. Computing the affinity matrix from feature similarity across consecutive point cloud scans forms an integral part of visual tracking. We propose an attention-based refinement module to refine the affinity matrix by suppressing erroneous correspondences. The module is designed to capture the global context in the affinity matrix by employing self-attention within each affinity matrix and cross-attention across a pair of affinity matrices. Unlike competing approaches, our network does not require complex post-processing algorithms, and processes raw LiDAR frames to directly output tracking results. We demonstrate the effectiveness of our method on three tracking benchmarks: JRDB, Waymo, and KITTI. Experimental evaluations indicate the ability of our model to generalize well across datasets.
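
The two ideas the abstract names, an affinity matrix built from feature similarity and its refinement by self- and cross-attention, can be illustrated with a minimal NumPy sketch. This is not the authors' network. It assumes cosine similarity as the similarity measure, a fixed number K of object slots per frame so that both affinity matrices are K x K, attention without learned projections, and a residual sum to combine the results. The function names affinity_matrix, attend, and refine are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def affinity_matrix(feats_a, feats_b, eps=1e-8):
    """Cosine similarity between every object feature of two scans.

    feats_a: (K, D) features from one scan, feats_b: (K, D) from the next.
    Returns a (K, K) matrix; entry (i, j) scores object i against object j.
    Cosine similarity is an assumption made for this sketch.
    """
    a = feats_a / (np.linalg.norm(feats_a, axis=1, keepdims=True) + eps)
    b = feats_b / (np.linalg.norm(feats_b, axis=1, keepdims=True) + eps)
    return a @ b.T


def attend(queries, keys_values):
    """Single-head scaled dot-product attention with no learned weights."""
    d = queries.shape[-1]
    weights = softmax(queries @ keys_values.T / np.sqrt(d), axis=-1)
    return weights @ keys_values


def refine(aff_prev, aff_curr):
    """Add global context to aff_curr (assumed combination rule).

    Self-attention lets each row of aff_curr look at all other rows of
    aff_curr; cross-attention lets it look at the rows of aff_prev.
    """
    self_ctx = attend(aff_curr, aff_curr)
    cross_ctx = attend(aff_curr, aff_prev)
    return aff_curr + self_ctx + cross_ctx


rng = np.random.default_rng(0)
K, D = 4, 8  # hypothetical: 4 object slots, 8-dimensional features
f0, f1, f2 = (rng.normal(size=(K, D)) for _ in range(3))
aff_01 = affinity_matrix(f0, f1)  # scans t-2 and t-1
aff_12 = affinity_matrix(f1, f2)  # scans t-1 and t
refined = refine(aff_01, aff_12)
print(refined.shape)
```

In a trained model the queries, keys, and values would pass through learned projections, and the refined matrix would feed the association step. Both are omitted here to keep the data flow visible.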
