Paper Title
Can Increasing Input Dimensionality Improve Deep Reinforcement Learning?
Paper Authors
Paper Abstract
Deep reinforcement learning (RL) algorithms have recently achieved remarkable successes in various sequential decision-making tasks, leveraging advances in methods for training large deep networks. However, these methods usually require large amounts of training data, which is often a significant problem for real-world applications. A natural question to ask is whether learning good representations for states and using larger networks helps in learning better policies. In this paper, we study whether increasing input dimensionality helps improve the performance and sample efficiency of model-free deep RL algorithms. To do so, we propose an online feature extractor network (OFENet) that uses neural nets to produce good representations to be used as inputs to deep RL algorithms. Even though high-dimensional inputs are usually thought to make learning more difficult for RL agents, we show that RL agents in fact learn more efficiently with the high-dimensional representation than with the lower-dimensional state observations. We believe that stronger feature propagation together with larger networks (and thus a larger search space) allows RL agents to learn more complex functions of states and thus improves sample efficiency. Through numerical experiments, we show that the proposed method outperforms several other state-of-the-art algorithms in terms of both sample efficiency and performance. Code for the proposed method is available at http://www.merl.com/research/license/OFENet.
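To make the central idea concrete, below is a minimal sketch of a feature extractor in the spirit the abstract describes: a stack of densely connected layers that expands a low-dimensional state observation into a higher-dimensional representation, which is then fed to the RL agent in place of the raw observation. This is not the authors' exact OFENet architecture; the layer sizes, the use of PyTorch, and the class name `DenseFeatureExtractor` are illustrative assumptions, and the sketch omits how the extractor is trained online alongside the RL algorithm.

```python
# Minimal sketch (assumed details, not the paper's exact architecture):
# each layer's output is concatenated with its input, so the representation
# grows in dimensionality at every layer ("stronger feature propagation").
import torch
import torch.nn as nn


class DenseFeatureExtractor(nn.Module):
    def __init__(self, obs_dim: int, num_layers: int = 4, growth: int = 40):
        super().__init__()
        self.layers = nn.ModuleList()
        dim = obs_dim
        for _ in range(num_layers):
            self.layers.append(nn.Sequential(nn.Linear(dim, growth), nn.ReLU()))
            dim += growth  # each layer appends `growth` new features
        self.output_dim = dim

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        features = obs
        for layer in self.layers:
            # dense connectivity: keep the input and append the new features
            features = torch.cat([features, layer(features)], dim=-1)
        return features  # higher-dimensional representation of the state


if __name__ == "__main__":
    # Example: a 17-dimensional observation (e.g., a typical MuJoCo
    # locomotion task) becomes a 177-dimensional input for the RL algorithm.
    extractor = DenseFeatureExtractor(obs_dim=17)
    obs = torch.randn(1, 17)
    z = extractor(obs)
    print(z.shape)  # torch.Size([1, 177])
```

In a full pipeline, the policy and value networks of a model-free algorithm (e.g., SAC or TD3) would consume `z` instead of `obs`, which is how the proposed method increases the input dimensionality seen by the RL agent.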