Viscosity solutions approach to finite-horizon continuous-time Markov decision process

Paper title

Viscosity solutions approach to finite-horizon continuous-time Markov decision process

Authors

Zhong-Wei Liao, Jinghai Shao

Abstract

This paper investigates optimal control problems for finite-horizon continuous-time Markov decision processes with delay-dependent control policies. We develop compactification methods for decision processes and show the existence of optimal policies. Subsequently, through the dynamic programming principle for delay-dependent control policies, we establish the differential-difference Hamilton-Jacobi-Bellman (HJB) equation in the setting of a discrete state space. Under certain conditions, we give a comparison principle and further prove that the value function is the unique viscosity solution to this HJB equation. Based on this, we show that among the class of delay-dependent control policies, there is an optimal one which is Markovian.

Paper title