Paper Title

Partially Observable Discrete-time Discounted Markov Games with General Utility

Authors

Bhabak, Arnab; Saha, Subhamay

Abstract

In this paper, we investigate a partially observable zero-sum game where the state process is a discrete-time Markov chain. We consider a general utility function in the optimization criterion. We show the existence of a value for both finite and infinite horizon games and also establish the existence of optimal policies. The main step involves converting the partially observable game into a completely observable game that also keeps track of the total discounted accumulated reward/cost.
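
The conversion step mentioned in the abstract can be illustrated with a small filtering sketch. This is not the authors' construction: it assumes finite state, action and observation sets, a one-stage reward that depends on the hidden state and both players' actions, and a discount factor beta. The names filter_step, expected_utility, P, Q and r are hypothetical. The idea shown is that a joint belief over (hidden state, accumulated discounted reward), together with the current discount weight, is an observable quantity that can serve as the state of a completely observable game.

```python
from collections import defaultdict


def filter_step(mu, z, a, b, y, P, Q, r, beta):
    """One Bayes-filter step on the joint belief (illustrative sketch only).

    mu   : dict {(s, w): prob}, joint belief over hidden state s and
           accumulated discounted reward w
    z    : current discount weight (beta ** n at stage n)
    a, b : actions of player 1 and player 2
    y    : observation received after the transition
    P    : dict {(s, a, b): {s_next: prob}}, assumed transition kernel
    Q    : dict {(s_next, y): prob}, assumed observation kernel
    r    : dict {(s, a, b): reward}, assumed one-stage reward
    beta : discount factor in (0, 1)
    """
    new_mu = defaultdict(float)
    for (s, w), p in mu.items():
        # The accumulated reward grows by the discounted one-stage reward.
        w_next = w + z * r[(s, a, b)]
        for s_next, p_trans in P[(s, a, b)].items():
            # Weight each successor by the likelihood of the observation y.
            new_mu[(s_next, w_next)] += p * p_trans * Q.get((s_next, y), 0.0)
    total = sum(new_mu.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    posterior = {k: v / total for k, v in new_mu.items() if v > 0.0}
    return posterior, beta * z


def expected_utility(mu, utility):
    """Expected utility of the accumulated reward under the joint belief."""
    return sum(p * utility(w) for (_, w), p in mu.items())
```

Starting from a belief concentrated on a known initial state with accumulated reward 0 and discount weight 1, repeated calls to filter_step produce the sequence of observable states. A general utility function is then applied to the accumulated reward component, as in expected_utility.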
