$ω$ - 局限性属性的满足性范围在有限参数马尔可夫决策过程中

论文标题

$ω$ - 局限性属性的满足性范围在有限参数马尔可夫决策过程中

Satisfiability Bounds for $ω$-Regular Properties in Bounded-Parameter Markov Decision Processes

论文作者

Křetínský, Jan, Meggendorfer, Tobias, Weininger, Maximilian

论文摘要

我们考虑计算在有限的参数马尔可夫决策过程（BMDP）中满足$ω$等级属性的最低和最大概率的问题。 BMDP是由Markov决策过程（MDP）引起的，该过程允许以未知的实际概率的间隔形式出现过渡概率的不确定性。 $ω$ - 惯性语言形成了大类属性，例如Rabin或Parity Automata，涵盖了诸如线性时间逻辑之类的丰富规格。在BMDP中，满足该属性的可能性取决于未知的过渡概率以及策略。在本文中，我们计算极端值。这解决了Dutreix和Coogan在CDC 2018中特别提出的问题，并在没有对手的情况下扩展了他们的结果马尔可夫链。主要思想是将其工作重新诠释为间隔MDP的分析以及BMDP问题，作为对$ω$常规随机游戏的分析，并提供了解决方案。该方法可以顺利扩展到有限参数随机游戏。

We consider the problem of computing minimum and maximum probabilities of satisfying an $ω$-regular property in a bounded-parameter Markov decision process (BMDP). BMDP arise from Markov decision processes (MDP) by allowing for uncertainty on the transition probabilities in the form of intervals where the actual probabilities are unknown. $ω$-regular languages form a large class of properties, expressible as, e.g., Rabin or parity automata, encompassing rich specifications such as linear temporal logic. In a BMDP the probability to satisfy the property depends on the unknown transitions probabilities as well as on the policy. In this paper, we compute the extreme values. This solves the problem specifically suggested by Dutreix and Coogan in CDC 2018, extending their results on interval Markov chains with no adversary. The main idea is to reinterpret their work as analysis of interval MDP and accordingly the BMDP problem as analysis of an $ω$-regular stochastic game, where a solution is provided. This method extends smoothly further to bounded-parameter stochastic games.

下载PDF全文

下载文献需遵守相关版权规定

论文标题