通过政策迭代进行固定马尔可夫链的最佳运输

论文标题

通过政策迭代进行固定马尔可夫链的最佳运输

Optimal Transport for Stationary Markov Chains via Policy Iteration

论文作者

O'Connor, Kevin, McGoff, Kevin, Nobel, Andrew B.

论文摘要

我们研究了一对固定有限国家马尔可夫链的最佳运输问题，重点是最佳过渡耦合的计算。过渡耦合是捕捉马尔可夫连锁店动态的受约束的运输计划家族。最佳过渡耦合（OTC）问题的解决方案对应于最小化长期平均成本的两个链的比对。我们在OTC问题和Markov决策过程之间建立了联系，并表明可以通过改编政策迭代来获得OTC问题的解决方案。对于具有较大状态空间的设置，我们基于OTC问题的熵调查版本开发快速近似算法，并为其触及率复杂性提供界限。我们为正则化和未注册算法建立了稳定性结果，从中，统计一致性结果是必然的。我们通过一项模拟研究从经验上验证我们的理论结果，表明近似算法表现出更快的总运行时，并且误差较低。最后，我们将方法的设置和应用扩展到隐藏的马尔可夫模型，并说明了在实践中使用计算机生成音乐的应用中所提出的算法的潜在使用。

We study the optimal transport problem for pairs of stationary finite-state Markov chains, with an emphasis on the computation of optimal transition couplings. Transition couplings are a constrained family of transport plans that capture the dynamics of Markov chains. Solutions of the optimal transition coupling (OTC) problem correspond to alignments of the two chains that minimize long-term average cost. We establish a connection between the OTC problem and Markov decision processes, and show that solutions of the OTC problem can be obtained via an adaptation of policy iteration. For settings with large state spaces, we develop a fast approximate algorithm based on an entropy-regularized version of the OTC problem, and provide bounds on its per-iteration complexity. We establish a stability result for both the regularized and unregularized algorithms, from which a statistical consistency result follows as a corollary. We validate our theoretical results empirically through a simulation study, demonstrating that the approximate algorithm exhibits faster overall runtime with low error. Finally, we extend the setting and application of our methods to hidden Markov models, and illustrate the potential use of the proposed algorithms in practice with an application to computer-generated music.

下载PDF全文

下载文献需遵守相关版权规定

论文标题