Paper Title
Multi-Agent Deep Reinforcement Learning for Cooperative Connected Vehicles
Paper Authors
Paper Abstract
Millimeter-wave (mmWave) base stations can offer abundant high-capacity channel resources to connected vehicles, so their quality-of-service (QoS) in terms of downlink throughput can be greatly improved. The mmWave base stations can operate alongside existing base stations (e.g., macro-cell base stations) on non-overlapping channels, and each vehicle can decide which base station to associate with and which channel to utilize in the heterogeneous network. Furthermore, because of the non-omnidirectional nature of mmWave communication, each vehicle must also decide how to align its beam toward the mmWave base station in order to associate with it. However, this joint problem is combinatorial and NP-hard, and thus incurs a high computational cost. In this paper, we solve the problem in a 3-tier heterogeneous vehicular network (HetVNet) with multi-agent deep reinforcement learning (DRL) in a way that maximizes the expected total reward (i.e., downlink throughput) of the vehicles. The multi-agent deep deterministic policy gradient (MADDPG) approach is introduced to obtain an optimal policy in the continuous action domain.
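The sketch below illustrates the general structure of a MADDPG setup of the kind named in the abstract: each vehicle acts as an agent with its own actor network that maps a local observation to a continuous action (e.g., base-station association preference, channel choice, beam-alignment angle), while a centralized critic scores the joint observation-action pair during training. All dimensions, network sizes, and the three-agent example are illustrative assumptions and are not taken from the paper.

import torch
import torch.nn as nn

class Actor(nn.Module):
    # Per-vehicle policy: local observation -> continuous action in [-1, 1].
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim), nn.Tanh(),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

class CentralizedCritic(nn.Module):
    # Critic that observes all agents' observations and actions
    # (centralized training, decentralized execution).
    def __init__(self, n_agents: int, obs_dim: int, act_dim: int, hidden: int = 128):
        super().__init__()
        joint_dim = n_agents * (obs_dim + act_dim)
        self.net = nn.Sequential(
            nn.Linear(joint_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, joint_obs: torch.Tensor, joint_act: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([joint_obs, joint_act], dim=-1))

if __name__ == "__main__":
    # Hypothetical sizes: 3 vehicles, 8-dimensional local observation,
    # 3-dimensional continuous action per vehicle.
    n_agents, obs_dim, act_dim = 3, 8, 3
    actors = [Actor(obs_dim, act_dim) for _ in range(n_agents)]
    critic = CentralizedCritic(n_agents, obs_dim, act_dim)

    # One fictitious forward pass for a batch of 4 transitions.
    obs = [torch.randn(4, obs_dim) for _ in range(n_agents)]
    acts = [actor(o) for actor, o in zip(actors, obs)]
    q = critic(torch.cat(obs, dim=-1), torch.cat(acts, dim=-1))
    print(q.shape)  # torch.Size([4, 1])

In an actual MADDPG training loop, each actor would be updated from the centralized critic's gradient while execution uses only the local actor, which is what makes the approach suitable for per-vehicle decisions over a continuous action space; the loop itself is omitted here.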