Paper Title
Experimental Analysis of Reinforcement Learning Techniques for Spectrum Sharing Radar
Authors
Abstract
In this work, we first describe a framework for applying Reinforcement Learning (RL) control to a radar system that operates in a congested spectral setting. We then compare the utility of several RL algorithms through a discussion of experiments performed on commercial off-the-shelf (COTS) hardware. Each RL technique is evaluated in terms of convergence, radar detection performance achieved in a congested spectral environment, and the ability to share 100 MHz of spectrum with an uncooperative communications system. We examine policy iteration, which solves an environment posed as a Markov Decision Process (MDP) by directly solving for a stochastic mapping between environmental states and radar waveforms, as well as Deep RL techniques, which use a form of Q-Learning to approximate a parameterized function that the radar uses to select optimal actions. We show that RL techniques are beneficial over a Sense-and-Avoid (SAA) scheme and discuss the conditions under which each approach is most effective.
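To make the policy iteration idea in the abstract concrete, the following is a minimal sketch on a toy problem, not the setup used in the experiments. Everything in it is an assumption made for illustration: the two-band spectrum, the interferer transition matrix `T`, the collision-based reward, the discount factor, and the function name `policy_iteration`. It also returns a deterministic policy for simplicity, whereas the abstract describes a stochastic mapping from states to waveforms.

```python
import numpy as np

def policy_iteration(P, R, gamma=0.9, max_iters=100):
    """Tabular policy iteration.

    P[a, s, s2] is the probability of moving from state s to s2 under action a.
    R[s, a] is the expected immediate reward for taking action a in state s.
    """
    n_actions, n_states, _ = P.shape
    policy = np.zeros(n_states, dtype=int)
    for _ in range(max_iters):
        # Policy evaluation: solve (I - gamma * P_pi) V = R_pi exactly.
        P_pi = P[policy, np.arange(n_states)]   # row s is P[policy[s], s, :]
        R_pi = R[np.arange(n_states), policy]
        V = np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi)
        # Policy improvement: act greedily with respect to V.
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        new_policy = Q.argmax(axis=1)
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
    return policy, V

# Hypothetical toy environment: the state is the band (0 or 1) the
# interferer currently occupies, and the action is the band the radar uses.
T = np.array([[0.9, 0.1],
              [0.1, 0.9]])          # interferer tends to stay in its band
P = np.stack([T, T])                # the radar's choice does not move the interferer
collision_free = 1.0 - np.eye(2)    # [a, s2] = 1 when radar band differs from s2
R = T @ collision_free.T            # R[s, a] = probability of avoiding a collision

policy, V = policy_iteration(P, R)
print(policy)  # the radar picks the band the interferer is not currently in: [1 0]
```

In this toy case the learned policy coincides with simple avoidance, because the assumed interferer is highly predictable; the value of an RL formulation shows up when the interference dynamics are richer than a single-step rule can capture.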