论文标题
曲线形式:曲线传播的3D车道检测和曲线查询和注意力
CurveFormer: 3D Lane Detection by Curve Propagation with Curve Queries and Attention
论文作者
论文摘要
3D车道检测是自动驾驶系统不可或缺的一部分。以前的CNN和基于变压器的方法通常首先从前视图图像中生成鸟类视图(BEV)特征图,然后使用带有BEV功能映射的子网络作为预测3D车道的输入。这种方法需要在BEV和前视图之间进行明确的视图转换,这本身仍然是一个具有挑战性的问题。在本文中,我们提出了一种基于单阶段变压器的方法,该方法直接计算3D车道参数并可以规避困难的视图变换步骤。具体而言,我们通过使用曲线查询来将3D车道检测作为曲线传播问题。 3D车道查询由动态和有序的锚点集表示。通过这种方式,在变压器解码器迭代中具有曲线表示的查询可以很好地完善3D车道检测结果。此外,引入了曲线跨界模块,以计算曲线查询和图像特征之间的相似性。此外,提供了可以捕获曲线查询的更多相对图像特征的上下文采样模块,以进一步提高3D车道检测性能。我们评估了合成数据集和现实世界数据集的3D车道检测方法,实验结果表明,与最先进的方法相比,我们的方法实现了有希望的性能。每个组件的有效性也通过消融研究验证。
3D lane detection is an integral part of autonomous driving systems. Previous CNN and Transformer-based methods usually first generate a bird's-eye-view (BEV) feature map from the front view image, and then use a sub-network with BEV feature map as input to predict 3D lanes. Such approaches require an explicit view transformation between BEV and front view, which itself is still a challenging problem. In this paper, we propose CurveFormer, a single-stage Transformer-based method that directly calculates 3D lane parameters and can circumvent the difficult view transformation step. Specifically, we formulate 3D lane detection as a curve propagation problem by using curve queries. A 3D lane query is represented by a dynamic and ordered anchor point set. In this way, queries with curve representation in Transformer decoder iteratively refine the 3D lane detection results. Moreover, a curve cross-attention module is introduced to compute the similarities between curve queries and image features. Additionally, a context sampling module that can capture more relative image features of a curve query is provided to further boost the 3D lane detection performance. We evaluate our method for 3D lane detection on both synthetic and real-world datasets, and the experimental results show that our method achieves promising performance compared with the state-of-the-art approaches. The effectiveness of each component is validated via ablation studies as well.