TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving

Paper Title

TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving

Authors

Kashyap Chitta, Aditya Prakash, Bernhard Jaeger, Zehao Yu, Katrin Renz, Andreas Geiger

Abstract

How should we integrate representations from complementary sensors for autonomous driving? Geometry-based fusion has shown promise for perception (e.g. object detection, motion forecasting). However, in the context of end-to-end driving, we find that imitation learning based on existing sensor fusion methods underperforms in complex driving scenarios with a high density of dynamic agents. Therefore, we propose TransFuser, a mechanism to integrate image and LiDAR representations using self-attention. Our approach uses transformer modules at multiple resolutions to fuse perspective view and bird's eye view feature maps. We experimentally validate its efficacy on a challenging new benchmark with long routes and dense traffic, as well as the official leaderboard of the CARLA urban driving simulator. At the time of submission, TransFuser outperforms all prior work on the CARLA leaderboard in terms of driving score by a large margin. Compared to geometry-based fusion, TransFuser reduces the average collisions per kilometer by 48%.
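
The fusion idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes single-head attention, a single resolution, equal channel counts for both branches, and hypothetical names (`fuse_with_self_attention`, `w_q`, `w_k`, `w_v`). It shows only the core step: flattening a perspective-view image feature map and a bird's eye view LiDAR feature map into one token sequence, applying self-attention across all tokens, and splitting the result back into the two maps.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def fuse_with_self_attention(img_feat, bev_feat, w_q, w_k, w_v):
    """Fuse two feature maps with one single-head self-attention step.

    img_feat: (C, H_img, W_img) perspective-view image features.
    bev_feat: (C, H_bev, W_bev) bird's eye view LiDAR features.
    w_q, w_k, w_v: (C, C) projection matrices (hypothetical parameters).
    Returns two arrays with the same shapes as the inputs.
    """
    c = img_feat.shape[0]

    # Flatten each spatial grid into a sequence of C-dimensional tokens.
    img_tokens = img_feat.reshape(c, -1).T  # (H_img * W_img, C)
    bev_tokens = bev_feat.reshape(c, -1).T  # (H_bev * W_bev, C)

    # Concatenate so every token can attend to tokens of both sensors.
    tokens = np.concatenate([img_tokens, bev_tokens], axis=0)  # (N, C)

    # Scaled dot-product self-attention over the joint sequence.
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    attn = softmax(q @ k.T / np.sqrt(c), axis=-1)  # (N, N), rows sum to 1

    # Residual connection: attended features are added to the originals.
    fused = tokens + attn @ v

    # Split the sequence and restore the original spatial layouts.
    n_img = img_tokens.shape[0]
    img_out = fused[:n_img].T.reshape(img_feat.shape)
    bev_out = fused[n_img:].T.reshape(bev_feat.shape)
    return img_out, bev_out
```

Under these assumptions, calling the function once per feature-map scale would mimic the "multiple resolutions" described in the abstract. A real model would also need positional embeddings, multiple heads, and learned weights, all of which are omitted here.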
