Paper Title

Robust $Q$-learning Algorithm for Markov Decision Processes under Wasserstein Uncertainty

Authors

Ariel Neufeld, Julian Sester

Abstract

We present a novel $Q$-learning algorithm tailored to solve distributionally robust Markov decision problems in which the ambiguity set of transition probabilities for the underlying Markov decision process is a Wasserstein ball around a (possibly estimated) reference measure. We prove convergence of the presented algorithm and provide several examples, some using real data, that illustrate both the tractability of our algorithm and the benefits of considering distributional robustness when solving stochastic optimal control problems, in particular when the estimated distributions turn out to be misspecified in practice.
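
As a rough sketch of the setting the abstract describes, the Wasserstein ambiguity set and the associated robust Bellman equation can be written as below. The notation is assumed for illustration and is not taken from the paper: $\widehat{P}(x,a)$ denotes the reference transition measure for state $x$ and action $a$, $\varepsilon \ge 0$ the radius of the ball, $W_q$ the $q$-Wasserstein distance, $r$ the reward function, $\alpha \in (0,1)$ the discount factor, and $X_1$ the next state.

$$
\mathcal{P}(x,a) = \left\{ P \,:\, W_q\big(P, \widehat{P}(x,a)\big) \le \varepsilon \right\},
\qquad
V(x) = \sup_{a \in A} \, \inf_{P \in \mathcal{P}(x,a)} \mathbb{E}_{P}\big[ r(x,a,X_1) + \alpha V(X_1) \big].
$$

For $\varepsilon = 0$ the ball contains only the reference measure, so the inner infimum disappears and the equation reduces to the usual non-robust Bellman equation.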
