Paper Title
Learning Discrete State Abstractions With Deep Variational Inference
Paper Authors
Paper Abstract
Abstraction is crucial for effective sequential decision making in domains with large state spaces. In this work, we propose an information bottleneck method for learning approximate bisimulations, a type of state abstraction. We use a deep neural encoder to map states onto continuous embeddings. We map these embeddings onto a discrete representation using an action-conditioned hidden Markov model, which is trained end-to-end with the neural network. Our method is suited for environments with high-dimensional states and learns from a stream of experience collected by an agent acting in a Markov decision process. Through this learned discrete abstract model, we can efficiently plan for unseen goals in a multi-goal Reinforcement Learning setting. We test our method in simplified robotic manipulation domains with image states. We also compare it against previous model-based approaches to finding bisimulations in discrete grid-world-like environments. Source code is available at https://github.com/ondrejba/discrete_abstractions.
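The pipeline the abstract describes (deep encoder → continuous embedding → HMM-style discretization → action-conditioned transitions over abstract states) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: all names, dimensions, and the random stand-ins for learned parameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 16    # flattened image state (illustrative assumption)
EMBED_DIM = 8     # continuous embedding size (assumption)
NUM_ABS = 4       # number of discrete abstract states (assumption)
NUM_ACTIONS = 3   # assumption

# 1. Stand-in for the deep neural encoder: state -> continuous embedding.
#    (A random linear map here; the paper uses a trained deep network.)
W = rng.normal(size=(STATE_DIM, EMBED_DIM))

def encode(state):
    return np.tanh(state @ W)

# 2. Discretization: soft assignment of an embedding to abstract states,
#    analogous to the emission step of a hidden Markov model.
centroids = rng.normal(size=(NUM_ABS, EMBED_DIM))

def assign(embedding):
    logits = -np.sum((centroids - embedding) ** 2, axis=1)
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

# 3. Action-conditioned transition model over abstract states:
#    T[a, i, j] = P(next abstract state j | abstract state i, action a).
T = rng.random(size=(NUM_ACTIONS, NUM_ABS, NUM_ABS))
T /= T.sum(axis=2, keepdims=True)

state = rng.normal(size=STATE_DIM)
belief = assign(encode(state))   # distribution over abstract states
next_belief = belief @ T[0]      # predicted distribution after action 0

print(belief.shape, next_belief.shape)
```

In the paper, the encoder, assignments, and transition tables are all learned end-to-end from experience; planning for unseen goals then amounts to searching over the small discrete abstract model rather than the raw image state space.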