Paper Title
Model Based Residual Policy Learning with Applications to Antenna Control
Paper Authors
Paper Abstract
Non-differentiable controllers and rule-based policies are widely used for controlling real systems such as telecommunication networks and robots. Specifically, these policies can dynamically configure the parameters of mobile network base station antennas to improve user coverage and quality of service. Motivated by the antenna tilt control problem, we introduce Model-Based Residual Policy Learning (MBRPL), a practical reinforcement learning (RL) method. MBRPL enhances existing policies through a model-based approach, leading to improved sample efficiency and fewer interactions with the actual environment than off-the-shelf RL methods. To the best of our knowledge, this is the first paper that examines a model-based approach for antenna control. Experimental results reveal that our method delivers strong initial performance while improving sample efficiency over previous RL methods, which is one step towards deploying these algorithms in real networks.
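The abstract does not spell out how MBRPL combines the existing policy with the learned component. The following is a minimal sketch of the general residual-policy idea only, assuming the learned term is simply added to the base controller's action and the sum is clipped to a valid range. The state layout, the thresholds, and the names `rule_based_tilt`, `make_residual_policy`, and `zero_residual` are all hypothetical illustrations and are not taken from the paper.

```python
from typing import Callable, Tuple

# Assumed state layout (hypothetical): (coverage score, quality score), each in [0, 1].
State = Tuple[float, float]
Policy = Callable[[State], float]


def rule_based_tilt(state: State) -> float:
    """Hypothetical rule-based controller returning a tilt change in [-1, 1]."""
    coverage, quality = state
    if coverage < 0.5:
        return -1.0  # uptilt to extend coverage
    if quality < 0.5:
        return 1.0   # downtilt to reduce interference
    return 0.0       # leave the tilt unchanged


def make_residual_policy(base: Policy, residual: Policy,
                         low: float = -1.0, high: float = 1.0) -> Policy:
    """Combine a fixed base policy with a learned correction (assumed additive)."""
    def policy(state: State) -> float:
        action = base(state) + residual(state)
        return max(low, min(high, action))  # keep the action in the valid range
    return policy


def zero_residual(state: State) -> float:
    """Untrained residual: contributes nothing, so the base policy is reproduced."""
    return 0.0


initial_policy = make_residual_policy(rule_based_tilt, zero_residual)
```

Under these assumptions, a residual that starts at zero reproduces the rule-based controller exactly, which is one plausible reading of why such methods can offer strong initial performance before any learning has taken place.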