基于语义细分的复杂道路和交通情况的有效的文本解释

论文标题

基于语义细分的复杂道路和交通情况的有效的文本解释

Efficient textual explanations for complex road and traffic scenarios based on semantic segmentation

论文作者

Zhao, Yiyue, Yun, Xinyu, Chai, Chen, Liu, Zhiyu, Fan, Wenxuan, Luo, Xiao

论文摘要

复杂的驾驶环境给自动驾驶汽车的视觉感知带来了巨大挑战。从复杂的道路和交通场景中提取清晰可解释的信息至关重要，并为决策和控制提供线索。但是，以前的场景说明已作为单独的模型实现。黑匣子模型使解释驾驶环境变得困难。它无法检测到全面的文本信息，需要高度的计算负载和时间消耗。因此，这项研究提出了一个全面有效的文本解释模型。从驾驶环境的336K视频框架中，选择复杂道路和交通情况的关键图像被选为数据集。通过转移学习，这项研究建立了一个准确有效的分割模型，以获取环境中的关键流量要素。基于XGBoost算法，开发了一个综合模型。该模型提供了有关流量元素状态，冲突对象的运动和场景复杂性的文本信息。该方法在现实世界的道路上进行了验证。它提高了关键流量元素的感知准确性至78.8％。每个时期的时间消耗达到13分钟，效率是预训练的网络的11.5倍。从模型分析的文本信息也符合现实。这些发现提供了有关复杂驾驶环境的清晰可解释的信息，该信息为后续决策和控制奠定了基础。它可以提高视觉感知能力，并丰富复杂交通情况的先验知识和判断。

The complex driving environment brings great challenges to the visual perception of autonomous vehicles. It's essential to extract clear and explainable information from the complex road and traffic scenarios and offer clues to decision and control. However, the previous scene explanation had been implemented as a separate model. The black box model makes it difficult to interpret the driving environment. It cannot detect comprehensive textual information and requires a high computational load and time consumption. Thus, this study proposed a comprehensive and efficient textual explanation model. From 336k video frames of the driving environment, critical images of complex road and traffic scenarios were selected into a dataset. Through transfer learning, this study established an accurate and efficient segmentation model to obtain the critical traffic elements in the environment. Based on the XGBoost algorithm, a comprehensive model was developed. The model provided textual information about states of traffic elements, the motion of conflict objects, and scenario complexity. The approach was verified on the real-world road. It improved the perception accuracy of critical traffic elements to 78.8%. The time consumption reached 13 minutes for each epoch, which was 11.5 times more efficient than the pre-trained network. The textual information analyzed from the model was also accordant with reality. The findings offer clear and explainable information about the complex driving environment, which lays a foundation for subsequent decision and control. It can improve the visual perception ability and enrich the prior knowledge and judgments of complex traffic situations.

下载PDF全文

下载文献需遵守相关版权规定

论文标题