Paper Title
Model-Free Prediction of Partially Observable Spatiotemporal Chaotic Systems
Pervasive Machine Learning for Smart Radio Environments Enabled by Reconfigurable Intelligent Surfaces
Paper Authors
Paper Abstract
The emerging technology of Reconfigurable Intelligent Surfaces (RISs) is envisioned as an enabler of smart wireless environments, offering a highly scalable, low-cost, hardware-efficient, and almost energy-neutral solution for dynamically controlling the propagation of electromagnetic signals over the wireless medium, ultimately providing increased environmental intelligence for diverse operation objectives. One of the major challenges with the envisioned dense deployment of RISs in such reconfigurable radio environments is the efficient configuration of multiple metasurfaces that have limited, or even no, computing hardware. In this paper, we consider multi-user and multi-RIS-empowered wireless systems, and present a thorough survey of online machine learning approaches for the orchestration of their various tunable components. Focusing on sum-rate maximization as a representative design objective, we present a comprehensive problem formulation based on Deep Reinforcement Learning (DRL). We detail the correspondence between the parameters of the wireless system and the DRL terminology, and devise generic algorithmic steps for artificial neural network training and deployment, while discussing their implementation details. Further practical considerations for multi-RIS-empowered wireless communications in the sixth-generation (6G) era are presented along with some key open research challenges. Unlike the DRL-based status quo, we leverage the independence between the configuration of the system design parameters and the future states of the wireless environment, and present efficient multi-armed bandit approaches. Their sum-rate performance is shown numerically to exceed that of random configurations and to remain sufficiently close to that of the conventional Deep Q-Network (DQN) algorithm, at lower implementation complexity.
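To illustrate the bandit view mentioned in the abstract, the following is a minimal sketch and not the paper's algorithm. It treats each joint RIS phase configuration as one arm and the observed sum-rate as the reward, which is only sensible because the chosen configuration is assumed not to influence future channel states. Everything concrete here is an assumption made for illustration: the epsilon-greedy rule (a simple stand-in for whichever bandit strategies the paper evaluates), the static toy channels, the 1-bit phase set {0, π}, interference-free access per user, and the hypothetical names `make_channels`, `sum_rate`, and `run_bandit`.

```python
import cmath
import itertools
import math
import random


def make_channels(num_users, num_elements, rng):
    """Draw static toy channels (assumed model): a weak direct link and
    one cascaded BS-RIS-user coefficient per RIS element, for each user."""
    def cgauss():
        return complex(rng.gauss(0, 1), rng.gauss(0, 1)) / math.sqrt(2)
    direct = [0.1 * cgauss() for _ in range(num_users)]
    cascaded = [[cgauss() * cgauss() for _ in range(num_elements)]
                for _ in range(num_users)]
    return direct, cascaded


def sum_rate(phases, direct, cascaded, noise_power=1.0):
    """Sum of log2(1 + SNR) over users, assuming interference-free access."""
    total = 0.0
    for d, casc in zip(direct, cascaded):
        h = d + sum(c * cmath.exp(1j * p) for c, p in zip(casc, phases))
        total += math.log2(1.0 + abs(h) ** 2 / noise_power)
    return total


def run_bandit(num_rounds=2000, epsilon=0.1, noise_std=0.05, seed=0,
               num_users=2, num_elements=4):
    """Epsilon-greedy bandit over all 1-bit RIS phase configurations."""
    rng = random.Random(seed)
    direct, cascaded = make_channels(num_users, num_elements, rng)
    # Each arm is one joint phase configuration of the RIS elements.
    arms = list(itertools.product((0.0, math.pi), repeat=num_elements))
    true_rates = [sum_rate(a, direct, cascaded) for a in arms]
    counts = [0] * len(arms)
    means = [0.0] * len(arms)
    for t in range(num_rounds):
        if t < len(arms):
            arm = t  # pull every arm once before exploiting
        elif rng.random() < epsilon:
            arm = rng.randrange(len(arms))  # explore
        else:
            arm = max(range(len(arms)), key=means.__getitem__)  # exploit
        # Noisy sum-rate feedback for the chosen configuration.
        reward = true_rates[arm] + rng.gauss(0.0, noise_std)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # running mean
    best = max(range(len(arms)), key=means.__getitem__)
    return best, counts, true_rates


if __name__ == "__main__":
    best, counts, true_rates = run_bandit()
    print("selected arm:", best, "sum-rate:", true_rates[best])
    print("mean over all arms:", sum(true_rates) / len(true_rates))
```

The learner keeps only one running mean and one counter per arm, with no neural network or state transition model, which is the kind of simplification the abstract contrasts with DQN. Enumerating every configuration as an arm grows exponentially with the number of elements, so this sketch is only practical for very small surfaces.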