Paper Title

Option Discovery in the Absence of Rewards with Manifold Analysis

Authors

Amitay Bar, Ronen Talmon, Ron Meir

Abstract

Options have been shown to be an effective tool in reinforcement learning, facilitating improved exploration and learning. In this paper, we present an approach based on spectral graph theory and derive an algorithm that systematically discovers options without access to a specific reward or task assignment. As opposed to the common practice used in previous methods, our algorithm makes full use of the spectrum of the graph Laplacian. Incorporating modes associated with higher graph frequencies unravels domain subtleties, which are shown to be useful for option discovery. Using geometric and manifold-based analysis, we present a theoretical justification for the algorithm. In addition, we showcase its performance in several domains, demonstrating clear improvements compared to competing methods.
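
To make the spectral ingredient mentioned in the abstract concrete, here is a minimal sketch of computing the full spectrum of a graph Laplacian for a small state graph. It is not the paper's algorithm: the 4-connected grid world, the helper names `grid_laplacian` and `heat_kernel`, and the choice of a heat kernel as the quantity to compare are all assumptions made for illustration. It only shows how much information is dropped when a spectral quantity is rebuilt from a few low-frequency modes instead of all of them.

```python
import numpy as np


def grid_laplacian(rows, cols):
    """Combinatorial Laplacian L = D - A of a 4-connected grid of states (hypothetical domain)."""
    n = rows * cols
    adjacency = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:  # edge to the right neighbour
                adjacency[i, i + 1] = adjacency[i + 1, i] = 1.0
            if r + 1 < rows:  # edge to the neighbour below
                adjacency[i, i + cols] = adjacency[i + cols, i] = 1.0
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency


def heat_kernel(eigvals, eigvecs, t, k=None):
    """exp(-t L) rebuilt from the first k spectral modes (all modes if k is None)."""
    k = len(eigvals) if k is None else k
    weights = np.exp(-t * eigvals[:k])
    return (eigvecs[:, :k] * weights) @ eigvecs[:, :k].T


laplacian = grid_laplacian(4, 5)
# eigh returns eigenvalues in ascending order: low graph frequencies come first.
eigvals, eigvecs = np.linalg.eigh(laplacian)

full = heat_kernel(eigvals, eigvecs, t=0.5)            # every mode
truncated = heat_kernel(eigvals, eigvecs, t=0.5, k=3)  # three lowest-frequency modes only
gap = np.abs(full - truncated).max()
print(f"largest entry lost by keeping only 3 modes: {gap:.3f}")
```

The difference between `full` and `truncated` is carried entirely by the higher-frequency eigenvectors, which is the part of the spectrum the abstract says earlier methods commonly leave unused.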
