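Millimeter Wave Communications with an Intelligent Reflector: Performance Optimization and Distributional Reinforcement Learning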

Paper Title

Millimeter Wave Communications with an Intelligent Reflector: Performance Optimization and Distributional Reinforcement Learning

Authors

Zhang, Qianqian; Saad, Walid; Bennis, Mehdi

Abstract

In this paper, a novel framework is proposed to optimize the downlink multi-user communication of a millimeter wave base station (BS), which is assisted by a reconfigurable intelligent reflector (IR). In particular, a channel estimation approach is developed to measure the channel state information (CSI) in real time. First, for a perfect CSI scenario, the precoding transmission of the BS and the reflection coefficient of the IR are jointly optimized, via an iterative approach, so as to maximize the sum of downlink rates towards multiple users. Next, in the imperfect CSI scenario, a distributional reinforcement learning (DRL) approach is proposed to learn the optimal IR reflection and maximize the expectation of downlink capacity. In order to model the transmission rate's probability distribution, a learning algorithm, based on quantile regression (QR), is developed, and the proposed QR-DRL method is proved to converge to a stable distribution of downlink transmission rate. Simulation results show that, in the error-free CSI scenario, the proposed approach yields over 30% and two-fold increases in the downlink sum-rate, compared with a fixed IR reflection scheme and a direct transmission scheme, respectively. Simulation results also show that by deploying more IR elements, the downlink sum-rate can be significantly improved. However, as the number of IR elements increases, more time is required for channel estimation, and the slope of increase in the IR-aided transmission rate becomes smaller. Furthermore, under limited knowledge of CSI, simulation results show that the proposed QR-DRL method, which learns a full distribution of the downlink rate, yields a better prediction accuracy and improves the downlink rate by 10% for online deployments, compared with a Q-learning baseline.
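The quantile regression idea mentioned in the abstract can be illustrated with a toy sketch. This is not the paper's QR-DRL algorithm: it only shows how a set of quantile estimates can be learned from noisy samples by stochastic steps on the pinball (quantile) loss, and how an expected value can then be read off the learned quantiles. The function names, the learning rate, and the uniform "rate" distribution on [0, 10] are all assumptions made for illustration.

```python
import random


def quantile_midpoints(n):
    """Return the n quantile midpoints tau_i = (2i + 1) / (2n)."""
    return [(2 * i + 1) / (2 * n) for i in range(n)]


def qr_update(theta, taus, sample, lr):
    """One stochastic subgradient step on the pinball (quantile) loss.

    Each estimate moves up by lr * tau when the sample lies at or above it,
    and down by lr * (1 - tau) when the sample lies below it.
    """
    return [
        t + lr * (tau - (1.0 if sample < t else 0.0))
        for t, tau in zip(theta, taus)
    ]


def estimate_quantiles(sampler, n=5, steps=20000, lr=0.01):
    """Learn n quantile estimates of the distribution behind `sampler`."""
    taus = quantile_midpoints(n)
    theta = [0.0] * n  # hypothetical initial guess
    for _ in range(steps):
        theta = qr_update(theta, taus, sampler(), lr)
    return taus, theta


# Toy stand-in for a random downlink rate (assumption: uniform on [0, 10]).
rng = random.Random(0)
taus, theta = estimate_quantiles(lambda: rng.uniform(0.0, 10.0))

# The mean of the quantile estimates approximates the expected rate.
mean_rate = sum(theta) / len(theta)
print([round(t, 2) for t in theta], round(mean_rate, 2))
```

In a full distributional reinforcement learning method, estimates of this kind would be kept per state-action pair and updated from bootstrapped targets rather than from direct samples; the sketch leaves that part out.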
