Partially Connected Automated Vehicle Cooperative Control Strategy with a Deep Reinforcement Learning Approach

Paper Title

Partially Connected Automated Vehicle Cooperative Control Strategy with a Deep Reinforcement Learning Approach

Authors

Haotian Shi, Yang Zhou, Keshu Wu, Xin Wang, Yangxin Lin, Bin Ran

Abstract

This paper proposes a cooperative longitudinal control strategy for connected and automated vehicles (CAVs) in a partially connected and automated traffic environment, based on a deep reinforcement learning (DRL) algorithm. The strategy enhances the string stability of mixed traffic, car-following efficiency, and energy efficiency. Because the possible sequences of mixed traffic are combinatorial, we decompose mixed traffic into multiple subsystems to reduce the training dimension and alleviate the communication burden; each subsystem consists of human-driven vehicles (HDVs) followed by cooperative CAVs. On this basis, a cooperative CAV control strategy is developed with a DRL algorithm, enabling CAVs to learn the leading HDV's characteristics and make longitudinal control decisions cooperatively, improving the performance of each subsystem locally and consequently the performance of the whole mixed traffic flow. For training, distributed proximal policy optimization is applied to ensure convergence of the proposed DRL. To verify the effectiveness of the proposed method, simulated experiments are conducted. They show that the proposed model generalizes well, dampening oscillations and fulfilling the car-following and energy-saving tasks efficiently under different penetration rates and various leading-HDV behaviors.

Keywords: partially connected automated traffic environment, cooperative control, deep reinforcement learning, traffic oscillation dampening, energy efficiency
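
The decomposition of mixed traffic into subsystems can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `decompose`, the string labels "HDV" and "CAV", and the splitting rule (a new subsystem starts whenever an HDV directly follows a CAV, and CAVs at the very front form a subsystem without a leading HDV) are assumptions chosen for illustration, since the abstract does not spell out these details.

```python
from typing import List


def decompose(platoon: List[str]) -> List[List[str]]:
    """Split a front-to-back sequence of vehicle labels into subsystems.

    Assumed rule (not taken from the paper): a new subsystem begins
    whenever an HDV directly follows a CAV, so each subsystem is a run
    of HDVs followed by the CAVs travelling behind them.
    """
    subsystems: List[List[str]] = []
    for i, kind in enumerate(platoon):
        if kind not in ("HDV", "CAV"):
            raise ValueError(f"unknown vehicle type: {kind!r}")
        # Open a new group at the head of the platoon or at a CAV -> HDV boundary.
        starts_new = not subsystems or (kind == "HDV" and platoon[i - 1] == "CAV")
        if starts_new:
            subsystems.append([])
        subsystems[-1].append(kind)
    return subsystems


if __name__ == "__main__":
    traffic = ["HDV", "CAV", "CAV", "HDV", "HDV", "CAV"]
    for group in decompose(traffic):
        print(group)
```

Under this assumed rule, each group can be trained and controlled on its own, which is the sense in which the decomposition reduces the training dimension: the CAVs in a group only need information about the HDVs ahead of them in the same group.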
