Qualitative Controller Synthesis for Consumption Markov Decision Processes

Paper Title

Qualitative Controller Synthesis for Consumption Markov Decision Processes

Authors

Blahoudek, František, Brázdil, Tomáš, Novotný, Petr, Ornik, Melkior, Thangeda, Pranay, Topcu, Ufuk

Abstract

Consumption Markov Decision Processes (CMDPs) are probabilistic decision-making models of resource-constrained systems. In a CMDP, the controller possesses a certain amount of a critical resource, such as electric power. Each action of the controller can consume some amount of the resource. Resource replenishment is only possible in special reload states, in which the resource level can be reloaded up to the full capacity of the system. The task of the controller is to prevent resource exhaustion, i.e. ensure that the available amount of the resource stays non-negative, while ensuring an additional linear-time property. We study the complexity of strategy synthesis in consumption MDPs with almost-sure Büchi objectives. We show that the problem can be solved in polynomial time. We implement our algorithm and show that it can efficiently solve CMDPs modelling real-world scenarios.
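The resource semantics described in the abstract can be illustrated with a minimal sketch. It is not the authors' synthesis algorithm: it only tracks the resource level along one fixed sequence of actions. The names `Step` and `resource_levels` are hypothetical, and the sketch assumes that consumption is an integer and that the level is reset to full capacity on arrival at a reload state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One action along a path (hypothetical helper type)."""
    consumption: int  # resource units consumed by the action
    target: str       # state reached after the action


def resource_levels(initial_level, capacity, reload_states, steps):
    """Return the resource level after each step, or None on exhaustion.

    Assumption: the level is reset to `capacity` when the path
    arrives at a state in `reload_states`.
    """
    levels = []
    level = initial_level
    for step in steps:
        level -= step.consumption
        if level < 0:
            # The resource would become negative: the path is unsafe.
            return None
        if step.target in reload_states:
            level = capacity
        levels.append(level)
    return levels


path = [Step(3, "a"), Step(2, "dock"), Step(4, "b")]
safe_levels = resource_levels(5, 10, {"dock"}, path)
```

In this example the path stays safe because the level reaches exactly zero on arrival at the reload state `dock`, where it is reset to the capacity. A controller in the sense of the abstract must guarantee this kind of safety for every possible outcome of the probabilistic transitions, not just for one path.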
