Paper Title

Policy Gradient Reinforcement Learning for Uncertain Polytopic LPV Systems based on MHE-MPC

Authors

Hossein Nejatbakhsh Esfahani and Sebastien Gros

Abstract

In this paper, we propose a learning-based Model Predictive Control (MPC) approach for polytopic Linear Parameter-Varying (LPV) systems with inexact scheduling parameters (exogenous signals with inexact bounds), in which the Linear Time-Invariant (LTI) models (vertices) combined through the scheduling parameters become inaccurate. We first propose to adopt a Moving Horizon Estimation (MHE) scheme to simultaneously estimate the convex combination vector and the unmeasured states based on the observations and the model-matching error. To tackle the inaccurate LTI models used in both the MPC and MHE schemes, we then adopt Policy Gradient (PG) Reinforcement Learning (RL) to learn both the estimator (MHE) and the controller (MPC) so that the best closed-loop performance is achieved. The effectiveness of the proposed RL-based MHE/MPC design is demonstrated on an illustrative example.
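Two ingredients of the abstract can be sketched concretely: a polytopic LPV model as a convex combination of LTI vertex models, and recovering the combination vector from the model-matching error. The sketch below is illustrative only — the vertex matrices, scheduling weights, and the closed-form one-step estimator are assumptions, not the paper's MHE scheme, which solves an optimization over a moving horizon of past observations.

```python
import numpy as np

# Hypothetical 2-vertex polytopic LPV system (matrices are made up, not
# from the paper). The dynamics are a convex combination of LTI vertices:
#   x+ = sum_i lam_i * (A_i x + B_i u),  lam_i >= 0, sum_i lam_i = 1.
A1, B1 = np.array([[0.9, 0.1], [0.0, 0.8]]), np.array([[0.0], [1.0]])
A2, B2 = np.array([[1.0, 0.2], [0.1, 0.9]]), np.array([[0.5], [1.0]])

def step(x, u, lam):
    """Propagate x+ = lam[0]*(A1 x + B1 u) + lam[1]*(A2 x + B2 u)."""
    f1 = A1 @ x + B1 @ u
    f2 = A2 @ x + B2 @ u
    return lam[0] * f1 + lam[1] * f2

def estimate_lambda(x, u, x_plus):
    """One-step least-squares estimate of the convex combination vector
    from the model-matching error — a toy stand-in for the MHE problem,
    which would fit lam (and unmeasured states) over a whole horizon.
    With 2 vertices the simplex is one-dimensional, so the projected
    least-squares solution is closed-form."""
    f1 = A1 @ x + B1 @ u
    f2 = A2 @ x + B2 @ u
    d = f1 - f2
    t = float(np.dot(x_plus - f2, d) / np.dot(d, d))
    t = min(max(t, 0.0), 1.0)   # project onto [0, 1], i.e. the simplex
    return np.array([t, 1.0 - t])

x, u = np.array([1.0, 0.0]), np.array([0.1])
x_plus = step(x, u, lam=[0.7, 0.3])
lam_hat = estimate_lambda(x, u, x_plus)
print(lam_hat)   # recovers [0.7, 0.3] in this noise-free case
```

In the paper's setting the vertex models themselves are inaccurate, so even a perfect fit of `lam` leaves a residual model-matching error; that is the gap the PG-RL layer closes by adjusting the parameters of both the MHE estimator and the MPC controller toward the best closed-loop performance.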
