Causality-driven Hierarchical Structure Discovery for Reinforcement Learning

Paper Title

Causality-driven Hierarchical Structure Discovery for Reinforcement Learning

Authors

Shaohui Peng, Xing Hu, Rui Zhang, Ke Tang, Jiaming Guo, Qi Yi, Ruizhi Chen, Xishan Zhang, Zidong Du, Ling Li, Qi Guo, Yunji Chen

Abstract

Hierarchical reinforcement learning (HRL) improves agents' exploration efficiency on sparse-reward tasks when it is guided by high-quality hierarchical structures (e.g., subgoals or options). However, discovering such structures automatically remains a major challenge. Previous HRL methods rely on a randomness-driven exploration paradigm, whose low exploration efficiency means they can hardly discover hierarchical structures in complex environments. To address this issue, we propose CDHRL, a causality-driven hierarchical reinforcement learning framework that uses causality-driven discovery, rather than randomness-driven exploration, to build high-quality hierarchical structures in complicated environments. The key insight is that the causalities among environment variables are naturally suited to modeling reachable subgoals and their dependencies, and can guide the construction of high-quality hierarchical structures. Results in two complex environments, 2D-Minecraft and Eden, show that CDHRL significantly boosts exploration efficiency with the causality-driven paradigm.
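The key insight in the abstract, that causal relations among environment variables describe reachable subgoals and their dependencies, can be illustrated with a small sketch. This is not the CDHRL algorithm: it assumes the causal graph is already known, whereas the paper discovers it, and the variable names (wood, stick, and so on) are hypothetical stand-ins for a Minecraft-like crafting task. It only shows how a given causal graph induces an ordering of subgoals.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical causal graph (an assumption for illustration only).
# Each key is an environment variable; its value is the set of
# variables assumed to cause it, i.e. that must be controllable first.
CAUSES = {
    "wood": set(),
    "stick": {"wood"},
    "wood_pickaxe": {"wood", "stick"},
    "stone": {"wood_pickaxe"},
    "stone_pickaxe": {"stone", "stick"},
}


def subgoal_levels(causes):
    """Group variables into levels so each level depends only on earlier ones."""
    sorter = TopologicalSorter(causes)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    levels = []
    while sorter.is_active():
        # All variables whose causes have already been handled.
        ready = sorted(sorter.get_ready())
        levels.append(ready)
        sorter.done(*ready)
    return levels


if __name__ == "__main__":
    for depth, names in enumerate(subgoal_levels(CAUSES)):
        print(depth, names)
```

Each level lists subgoals whose assumed causes sit in earlier levels, so an agent could learn to control the variables level by level instead of stumbling on them through random exploration.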
