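Stochastic impulse control of non-smooth dynamics with partial observation and execution delay: application to an environmental restoration problem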

Title

Stochastic impulse control of non-smooth dynamics with partial observation and execution delay: application to an environmental restoration problem

Authors

Hidekazu Yoshioka, Yuta Yaegashi

Abstract

Non-smooth dynamics driven by stochastic disturbances arise in a wide variety of engineering problems. Impulsive interventions are often employed to control stochastic systems; however, modeling and analysis subject to execution delay have been less explored. In addition, continuously receiving information about the dynamics is not always possible. In this paper, with an application to an environmental restoration problem, a continuous-time stochastic impulse control problem subject to execution delay under discrete and random observations is newly formulated and analyzed. The dynamics have a non-smooth coefficient modulated by a Markov chain, and eventually attain an undesirable state, such as depletion, because of the non-smoothness. The goal of the control problem is to find the most cost-efficient policy that prevents the dynamics from attaining the undesirable state. We demonstrate that finding the optimal policy reduces to solving a non-standard system of degenerate elliptic equations, the optimality equation, which is rigorously and analytically verified in a simplified case. The associated Fokker-Planck equation for the controlled dynamics is derived and solved explicitly as well. The model is finally applied to numerical computation of a recent river environmental restoration problem. The optimality and Fokker-Planck equations are successfully computed, and the optimal policy and the probability density functions are obtained numerically. The impacts of execution delay are discussed to analyze the model in more depth.
