Stochastic Finite State Control of POMDPs with LTL Specifications

Paper Title

Stochastic Finite State Control of POMDPs with LTL Specifications

Authors

Mohamadreza Ahmadi, Rangoli Sharan, Joel W. Burdick

Abstract

Partially observable Markov decision processes (POMDPs) provide a modeling framework for autonomous decision making under uncertainty and imperfect sensing, e.g., robot manipulation and self-driving cars. However, optimal control of POMDPs is notoriously intractable. This paper considers the quantitative problem of synthesizing sub-optimal stochastic finite state controllers (sFSCs) for POMDPs such that the probability of satisfying a set of high-level specifications, expressed as linear temporal logic (LTL) formulae, is maximized. We begin by casting this problem as an optimization problem and use relaxations based on the Poisson equation and McCormick envelopes. Then, we propose a stochastic bounded policy iteration algorithm, leading to a controlled growth in sFSC size and an anytime algorithm, where the performance of the controller improves with successive iterations but can be stopped by the user based on time or memory considerations. We illustrate the proposed method with a robot navigation case study.
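
For reference, the McCormick envelope mentioned in the abstract is a standard convex relaxation of a bilinear term. The sketch below states its textbook form for a product $w = xy$ with box bounds. The symbols $x$, $y$, $w$ and the bounds are illustrative assumptions and are not taken from the paper, whose exact formulation is not given in this abstract.

```latex
% Standard McCormick envelope for w = x y, assuming
% x^L <= x <= x^U and y^L <= y <= y^U (illustrative symbols).
\begin{align}
  w &\ge x^L y + x\, y^L - x^L y^L, \\
  w &\ge x^U y + x\, y^U - x^U y^U, \\
  w &\le x^U y + x\, y^L - x^U y^L, \\
  w &\le x^L y + x\, y^U - x^L y^U.
\end{align}
```

Each inequality follows from expanding a product of two non-negative bound gaps, for example $(x - x^L)(y - y^L) \ge 0$ for the first one, and then replacing $xy$ with $w$. The resulting constraints are linear in $(x, y, w)$.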
