Paper Title

A Brief Look at Generalization in Visual Meta-Reinforcement Learning

Paper Authors

Safa Alver, Doina Precup

Paper Abstract

Due to the realization that deep reinforcement learning algorithms trained on high-dimensional tasks can strongly overfit to their training environments, there have been several studies that investigated the generalization performance of these algorithms. However, there has been no similar study that evaluated the generalization performance of algorithms that were specifically designed for generalization, i.e. meta-reinforcement learning algorithms. In this paper, we assess the generalization performance of these algorithms by leveraging high-dimensional, procedurally generated environments. We find that these algorithms can display strong overfitting when they are evaluated on challenging tasks. We also observe that scalability to high-dimensional tasks with sparse rewards remains a significant problem among many of the current meta-reinforcement learning algorithms. With these results, we highlight the need for developing meta-reinforcement learning algorithms that can both generalize and scale.
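The sketch below is not taken from the paper; it is a minimal illustration of the general protocol the abstract describes: train on one set of procedurally generated levels, evaluate on a disjoint held-out set, and report the gap in average return as a measure of overfitting. It assumes the open-source procgen package (CoinRun as a stand-in environment), the classic gym API (reset() returning an observation, step() returning a 4-tuple), and a random-action placeholder in place of a meta-trained, task-adapted policy.

```python
# Minimal sketch of a train/held-out generalization check on procedurally
# generated levels. Assumptions (not from the paper): the `procgen` package,
# the classic gym (<0.26) API, and a random policy standing in for a
# meta-trained agent adapted to each task.
import gym
import numpy as np


def evaluate(env, policy, episodes=10):
    """Average episodic return of `policy` on `env`."""
    returns = []
    for _ in range(episodes):
        obs, done, total = env.reset(), False, 0.0
        while not done:
            obs, reward, done, _ = env.step(policy(obs))
            total += float(reward)
        returns.append(total)
    return float(np.mean(returns))


def random_policy(obs):
    # Placeholder for a meta-trained policy; Procgen games expose 15 discrete actions.
    return int(np.random.randint(15))


# Training distribution: levels 0-199. Held-out distribution: a disjoint
# range of 200 unseen levels starting at seed 10000.
train_env = gym.make("procgen:procgen-coinrun-v0", num_levels=200, start_level=0)
test_env = gym.make("procgen:procgen-coinrun-v0", num_levels=200, start_level=10000)

train_return = evaluate(train_env, random_policy)
test_return = evaluate(test_env, random_policy)
print(f"train return: {train_return:.2f}, held-out return: {test_return:.2f}")
print(f"generalization gap: {train_return - test_return:.2f}")
```

A large positive gap between the training-level and held-out-level returns is the overfitting signal the abstract refers to; swapping the random policy for an adapted meta-RL agent and averaging over many held-out levels is the corresponding measurement in practice.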
