Faster than FAST: GPU-Accelerated Frontend for High-Speed VIO

Paper Title

Faster than FAST: GPU-Accelerated Frontend for High-Speed VIO

Authors

Balazs Nagy, Philipp Foehn, Davide Scaramuzza

Abstract

The recent introduction of powerful embedded graphics processing units (GPUs) has allowed for unforeseen improvements in real-time computer vision applications. It has enabled algorithms to run onboard, well above the standard video rates, yielding not only higher information processing capability, but also reduced latency. This work focuses on the applicability of efficient low-level, GPU hardware-specific instructions to improve on existing computer vision algorithms in the field of visual-inertial odometry (VIO). While most steps of a VIO pipeline work on visual features, they rely on image data for detection and tracking, of which both steps are well suited for parallelization. Especially non-maxima suppression and the subsequent feature selection are prominent contributors to the overall image processing latency. Our work first revisits the problem of non-maxima suppression for feature detection specifically on GPUs, and proposes a solution that selects local response maxima, imposes spatial feature distribution, and extracts features simultaneously. Our second contribution introduces an enhanced FAST feature detector that applies the aforementioned non-maxima suppression method. Finally, we compare our method to other state-of-the-art CPU and GPU implementations, where we always outperform all of them in feature tracking and detection, resulting in over 1000fps throughput on an embedded Jetson TX2 platform. Additionally, we demonstrate our work integrated in a VIO pipeline achieving a metric state estimation at ~200fps.
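
The abstract describes selecting response maxima while imposing a spatial feature distribution. The sketch below illustrates that general idea on the CPU in plain Python: the response map is divided into square cells and at most one feature, the strongest response, is kept per cell. It is a simplified illustration only, not the authors' GPU implementation; the function name `grid_nms`, the parameters `cell_size` and `threshold`, and the sample response map are all hypothetical.

```python
def grid_nms(response, cell_size, threshold=0.0):
    """Keep at most one feature per cell: the strongest response above threshold.

    response  : 2D list of corner scores (hypothetical input format)
    cell_size : side length of the square cells, in pixels (assumed parameter)
    threshold : scores at or below this value are ignored (assumed parameter)
    Returns a list of (row, col, score) tuples in cell scan order.
    """
    rows = len(response)
    cols = len(response[0]) if rows else 0
    features = []
    for cell_r in range(0, rows, cell_size):
        for cell_c in range(0, cols, cell_size):
            best = None
            # Scan every pixel of this cell, clamping at the image border.
            for r in range(cell_r, min(cell_r + cell_size, rows)):
                for c in range(cell_c, min(cell_c + cell_size, cols)):
                    score = response[r][c]
                    if score > threshold and (best is None or score > best[2]):
                        best = (r, c, score)
            if best is not None:
                features.append(best)
    return features


# Toy 4x4 response map split into four 2x2 cells.
response = [
    [0, 1, 0, 0],
    [2, 9, 0, 3],
    [0, 0, 0, 0],
    [5, 0, 0, 0],
]
features = grid_nms(response, cell_size=2)
print(features)
```

Because each cell contributes at most one feature, the selected points are spread across the image instead of clustering in highly textured regions. On a GPU the per-cell maximum search would presumably be done in parallel, but how the paper does this is not specified in the abstract.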
