On Non-Cooperative Perfect Information Semi-Markov Games

Paper title

On Non-Cooperative Perfect Information Semi-Markov Games

Authors

Bakshi, K. G., Sinha, S.

Abstract

We show that an N-person non-cooperative semi-Markov game under the limiting ratio average pay-off criterion has a pure semi-stationary Nash equilibrium. The zero-sum two-person case was dealt with in an earlier paper. The proof reduces such perfect information games to an associated semi-Markov decision process (SMDP) and then applies existence results from the theory of SMDPs. This reduction procedure also yields simple proofs of the following: (a) zero-sum two-person perfect information stochastic (Markov) games have a value and pure stationary optimal strategies for both players under discounted as well as undiscounted pay-off criteria; (b) similar conclusions hold for N-person non-cooperative perfect information stochastic games. All such games can be solved using any efficient algorithm for the reduced SMDP (an MDP in the case of stochastic games). In this paper we have implemented Mondal's algorithm to solve an SMDP under the limiting ratio average pay-off criterion.
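As a rough illustration of claim (a) in the discounted case, the sketch below runs plain value iteration on a toy zero-sum two-person perfect information stochastic game, in which each state is controlled by exactly one player. It is neither the SMDP reduction described in the abstract nor Mondal's algorithm. The function name `solve_discounted`, the two-state game, and the discount factor `beta` are assumptions made up for this example.

```python
# Hypothetical toy game: all names and numbers are illustrative, not from the paper.
# Perfect information: each state is controlled by exactly one player.
# Player 1 maximizes and player 2 minimizes the discounted pay-off to player 1.

def solve_discounted(controller, actions, beta=0.9, tol=1e-10):
    """Value iteration; returns (values, pure stationary strategy) keyed by state."""
    value = {s: 0.0 for s in controller}
    while True:
        new_value, policy = {}, {}
        for s, acts in actions.items():
            # q[a] = immediate reward + discounted expected continuation value
            q = {a: r + beta * sum(p * value[t] for t, p in trans.items())
                 for a, (r, trans) in acts.items()}
            pick = max if controller[s] == 1 else min
            policy[s] = pick(q, key=q.get)
            new_value[s] = q[policy[s]]
        if max(abs(new_value[s] - value[s]) for s in value) < tol:
            return new_value, policy
        value = new_value

# State -> controlling player (1 = maximizer, 2 = minimizer).
controller = {"s0": 1, "s1": 2}
# State -> action -> (reward to player 1, transition probabilities).
actions = {
    "s0": {"stay": (1.0, {"s0": 1.0}), "go": (0.0, {"s1": 1.0})},
    "s1": {"stay": (2.0, {"s1": 1.0}), "go": (0.0, {"s0": 1.0})},
}
value, policy = solve_discounted(controller, actions)
```

Because only one player acts in each state, the update takes a plain maximum or minimum over actions, so the strategy it returns is pure and stationary. In this toy game player 1 stays in `s0` and player 2 moves from `s1` to `s0`.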
