Kullback-Leibler control for discrete-time nonlinear systems on continuous spaces

Title

Kullback-Leibler control for discrete-time nonlinear systems on continuous spaces

Authors

Ito, Kaito; Kashima, Kenji

Abstract

Kullback-Leibler (KL) control enables efficient numerical methods for nonlinear optimal control problems. The crucial assumption of KL control is the full controllability of the transition distribution. However, this assumption is often violated when the dynamics evolve in a continuous space. Consequently, applying KL control to problems on continuous spaces requires some approximation, which leads to a loss of optimality. To avoid such an approximation, in this paper we reformulate the KL control problem for continuous spaces so that it does not require unrealistic assumptions. The key difference between the original and the reformulated KL control is that the former measures the control effort by the KL divergence between the controlled and uncontrolled transition distributions, while the latter replaces the uncontrolled transition with a noise-driven transition. We show that the reformulated KL control, like the original, admits efficient numerical algorithms, but without unreasonable assumptions. Specifically, the associated value function can be computed by a Monte Carlo method based on its path integral representation.
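The Monte Carlo idea in the last sentence can be illustrated with a minimal sketch. It assumes the generic path-integral form familiar from KL and linearly solvable control, V(x0) = -log E[exp(-sum of costs along a noise-driven trajectory)], which is not necessarily the exact representation derived in the paper. The dynamics `f`, the cost function `cost`, the additive Gaussian noise model, and every parameter name are hypothetical choices made for illustration only.

```python
import numpy as np

def estimate_value(x0, f, cost, horizon, noise_std, n_samples=10_000, seed=0):
    """Monte Carlo estimate of V(x0) = -log E[exp(-total cost)].

    Assumption (not taken from the paper): scalar state, noise-driven
    dynamics x_{k+1} = f(x_k) + w_k with w_k ~ N(0, noise_std**2), and a
    state cost `cost` charged at every step, including the terminal one.
    """
    rng = np.random.default_rng(seed)
    x = np.full(n_samples, float(x0))   # all sample paths start at x0
    total_cost = np.zeros(n_samples)    # accumulated cost per path
    for _ in range(horizon):
        total_cost += cost(x)
        x = f(x) + noise_std * rng.standard_normal(n_samples)
    total_cost += cost(x)               # terminal cost
    # Numerically stable log-mean-exp of the negated path costs.
    m = np.max(-total_cost)
    return -(m + np.log(np.mean(np.exp(-total_cost - m))))

# Hypothetical example: sinusoidal drift with a quadratic state cost.
value = estimate_value(
    x0=1.0, f=np.sin, cost=lambda x: 0.5 * x**2, horizon=10, noise_std=0.3
)
```

Because the estimate only needs forward simulation of noise-driven trajectories, no discretization of the state space is involved. The variance of the estimate grows with the spread of the path costs, so the sample count would need tuning in practice.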
