Paper Title
On Optimal Control of Discounted Cost Infinite-Horizon Markov Decision Processes Under Local State Information Structures
Paper Authors
Paper Abstract
This paper investigates a class of optimal control problems associated with Markov processes with local state information. The decision-maker has only local access to a subset of the state vector, as often encountered in decentralized control problems in multi-agent systems. Under this information structure, part of the state vector cannot be observed. We leverage ab initio principles and find a new form of Bellman equations to characterize the optimal policies of the control problem under local information structures. The dynamic programming solutions feature a mixture of dynamics associated with the unobservable state components and a local state-feedback policy based on the observable local information. We further characterize the optimal local state-feedback policy using linear programming methods. To reduce the computational complexity of the optimal policy, we propose an approximate algorithm based on virtual beliefs to find a sub-optimal policy. We show performance bounds on the sub-optimal solution and corroborate the results with numerical case studies.
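To make the virtual-belief idea concrete, the following is a minimal illustrative sketch, not the paper's exact construction. It assumes a toy MDP whose state splits into an observable component `x_o` and an unobservable component `x_u`; a fixed virtual belief over `x_u` averages the dynamics and cost into an ordinary MDP on the observable component, which is then solved by value iteration to obtain a local state-feedback policy. All transition probabilities, costs, and the belief below are invented for the demo.

```python
GAMMA = 0.9
XO = [0, 1]        # observable state component
XU = [0, 1]        # unobservable state component
ACTIONS = [0, 1]

# virtual belief over the unobservable component (assumed fixed here)
belief = {0: 0.7, 1: 0.3}

def P(xo, xu, a, xo_next):
    """Toy transition kernel on x_o; the hidden x_u biases the dynamics."""
    stay = 0.8 if a == xu else 0.4
    return stay if xo_next == xo else 1.0 - stay

def cost(xo, xu, a):
    """Toy stage cost depending on both components and the action."""
    return 1.0 * (xo != xu) + 0.1 * a

def P_bar(xo, a, xo_next):
    """Average the hidden component out with the virtual belief."""
    return sum(belief[xu] * P(xo, xu, a, xo_next) for xu in XU)

def c_bar(xo, a):
    """Belief-averaged stage cost seen by the local controller."""
    return sum(belief[xu] * cost(xo, xu, a) for xu in XU)

# standard value iteration on the belief-averaged MDP over x_o only
V = {xo: 0.0 for xo in XO}
for _ in range(500):
    V = {xo: min(c_bar(xo, a)
                 + GAMMA * sum(P_bar(xo, a, y) * V[y] for y in XO)
                 for a in ACTIONS)
         for xo in XO}

# greedy local state-feedback policy: a function of x_o alone
policy = {xo: min(ACTIONS,
                  key=lambda a: c_bar(xo, a)
                  + GAMMA * sum(P_bar(xo, a, y) * V[y] for y in XO))
          for xo in XO}
print(policy, {xo: round(v, 3) for xo, v in V.items()})
```

The key point the sketch mirrors is structural: the resulting policy is a feedback law on the observable component only, while the unobservable dynamics enter through the averaged kernel `P_bar` and cost `c_bar`. The paper's performance bounds quantify how far such a virtual-belief policy can be from the true optimum; this demo makes no such claim.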