Paper Title
A Multi-Agent Reinforcement Learning Approach For Safe and Efficient Behavior Planning Of Connected Autonomous Vehicles
Paper Authors
Paper Abstract
Recent advances in wireless technology enable connected autonomous vehicles (CAVs) to gather information about their environment through vehicle-to-vehicle (V2V) communication. In this work, we design an information-sharing-based multi-agent reinforcement learning (MARL) framework for CAVs, to take advantage of this extra information when making decisions and thereby improve traffic efficiency and safety. The safe actor-critic algorithm we propose introduces two new techniques: the truncated Q-function and safe action mapping. The truncated Q-function utilizes information shared by neighboring CAVs so that the joint state and action spaces of the Q-function do not grow with the size of the CAV system. We prove a bound on the approximation error between the truncated and global Q-functions. The safe action mapping, based on control barrier functions, provides a provable safety guarantee during both training and execution. Through experiments in the CARLA simulator, we show that our approach improves the CAV system's efficiency in terms of average velocity and comfort under different CAV ratios and traffic densities. We also show that our approach avoids executing unsafe actions and always maintains a safe distance from other vehicles. Finally, we construct an obstacle-at-corner scenario to show that shared vision helps CAVs observe obstacles earlier and take action to avoid traffic jams.
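To illustrate the safe-action-mapping idea at a high level, the following is a minimal one-dimensional car-following sketch of projecting an RL policy's action onto a safe set with a discrete-time control barrier function. All names, the dynamics model, and the parameter values (`d_safe`, `alpha`, `dt`) are illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical sketch: CBF-based safe action mapping for 1-D car following.
# State: gap d to the lead vehicle, relative speed dv = v_lead - v_ego.
# Barrier: h(d) = d - d_safe, so the safe set is h >= 0.
# Discrete-time CBF condition with class-K gain alpha in (0, 1]:
#     h(d_next) >= (1 - alpha) * h(d)

def safe_action_mapping(d, dv, a_rl, d_safe=5.0, alpha=0.5, dt=0.1):
    """Map the RL policy's proposed acceleration a_rl to a safe acceleration."""
    h = d - d_safe
    # One-step Euler prediction of the gap (ego acceleration shrinks the gap):
    #     d_next = d + dv * dt - 0.5 * a_ego * dt**2
    # Substituting into the CBF condition and solving for a_ego gives an
    # upper bound on the acceleration that keeps the system in the safe set:
    #     a_ego <= (dv * dt + alpha * h) / (0.5 * dt**2)
    a_max_safe = (dv * dt + alpha * h) / (0.5 * dt * dt)
    # The mapping is the identity whenever a_rl is already safe,
    # and otherwise clips it to the boundary of the safe set.
    return min(a_rl, a_max_safe)
```

With a large gap the policy's action passes through unchanged; with a closing gap near `d_safe`, the mapping forces braking regardless of what the policy proposed. Because the mapping is applied during both training and execution, the agent never takes an action outside the safe set, which is the property the paper's guarantee formalizes.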