Paper Title
Reinforcement Learning Based Robust Policy Design for Relay and Power Optimization in DF Relaying Networks
Paper Authors
Paper Abstract
In this paper, we study the outage minimization problem in a decode-and-forward (DF) cooperative network with relay uncertainty. To reduce the outage probability and improve the quality of service, existing studies usually rely on assumptions about both the exact instantaneous channel state information (CSI) and the environmental uncertainty. However, perfect instantaneous CSI is difficult to obtain in practical situations where channel states change rapidly, and the uncertainty of the communication environment may not be observable, which makes traditional methods inapplicable. We therefore turn to reinforcement learning (RL) methods, which require no prior knowledge of the underlying channel and no assumptions about the environmental uncertainty. An RL method learns from interaction with the communication environment, optimizes its action policy, and then produces relay selection and power allocation schemes. When RL methods are applied to communication scenarios with environmental uncertainty, we first analyse the robustness of the RL action policy by deriving a lower bound on its worst-case performance. Then, we propose an RL-based robust algorithm for outage probability minimization. Simulation results show that, compared with traditional RL methods, our approach generalizes better and improves the worst-case performance by about 6% when evaluated in unseen environments.
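
To make the setting the abstract describes concrete (a minimal sketch under assumed system parameters, not the paper's exact model): for a selected relay k with normalized transmit power p, half-duplex DF relaying, and no direct source-destination link, the end-to-end rate can be written as R_k(p) = (1/2) log2(1 + p * min{g_SRk, g_RkD}), and the outage probability as P_out = Pr{R_k(p) < R_th}, where R_th is the target rate. The short Python sketch below illustrates the interaction pattern the abstract refers to, in which an agent learns a joint relay-selection and power-allocation policy from outage feedback; the fading model, action space, and hyperparameters are illustrative assumptions, and the paper's actual algorithm is not reproduced here.

    import numpy as np

    # Minimal illustration (not the paper's algorithm): single-state Q-learning
    # for joint relay selection and power allocation in a simulated DF network.
    # Numbers of relays/power levels, the Rayleigh fading gains, and the rate
    # threshold below are illustrative assumptions.

    rng = np.random.default_rng(0)
    n_relays, n_power_levels = 4, 5                        # hypothetical action space
    powers = np.linspace(0.2, 1.0, n_power_levels)         # normalized transmit powers
    rate_threshold = 1.0                                   # target rate in bit/s/Hz (assumed)
    n_actions = n_relays * n_power_levels

    def outage(relay, power):
        """Simulated environment step: 1 if the DF link is in outage, else 0."""
        g_sr = rng.exponential(1.0 + 0.5 * relay)          # source -> relay channel gain
        g_rd = rng.exponential(2.0 - 0.3 * relay)          # relay -> destination channel gain
        snr = power * min(g_sr, g_rd)                      # DF end-to-end SNR (no direct link)
        rate = 0.5 * np.log2(1.0 + snr)                    # half-duplex pre-log factor 1/2
        return float(rate < rate_threshold)

    q = np.zeros(n_actions)                                # Q-value per (relay, power) action
    alpha, epsilon = 0.05, 0.1                             # learning rate, exploration rate
    for t in range(20000):
        a = rng.integers(n_actions) if rng.random() < epsilon else int(np.argmax(q))
        relay, p_idx = divmod(a, n_power_levels)
        reward = 1.0 - outage(relay, powers[p_idx])        # reward = no-outage indicator
        q[a] += alpha * (reward - q[a])                    # incremental value update

    best = int(np.argmax(q))
    print("best (relay, power index):", divmod(best, n_power_levels), "estimated success prob:", q[best])

A robust variant in the spirit of the abstract would train or evaluate such a policy over a family of perturbed environments and optimize its worst-case return; the single-environment loop above only shows the baseline interaction between the agent and the communication environment.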