Paper Title
Policy Learning for Nonlinear Model Predictive Control with Application to USVs
Authors
Abstract
The prohibitive computational load of nonlinear model predictive control (NMPC) has for decades prevented its use in robots that require high sampling rates. This paper addresses the policy learning problem for nonlinear MPC with system constraints and its application to unmanned surface vehicles (USVs): the nonlinear MPC policy is learned offline and deployed online to resolve the computational complexity issue. A deep neural network (DNN)-based policy learning MPC (PL-MPC) method is proposed to avoid solving nonlinear optimal control problems online. A detailed policy learning method is developed, and the PL-MPC algorithm is designed. A strategy to ensure the practical feasibility of policy implementation is proposed, and it is proved theoretically that the closed-loop system under the proposed method is asymptotically stable in probability. In addition, the PL-MPC algorithm is applied successfully to the motion control of USVs. The results show that the proposed algorithm can be implemented at a sampling rate of up to $5$ Hz with high-precision motion control. The experiment video is available at \url{https://v.youku.com/v_show/id_XNTkwMTM0NzM5Ng==.html}.
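
The offline-learning, online-deployment idea summarized in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the scalar plant in `step`, the constants `DT`, `U_MAX`, and `HORIZON`, the grid-search `expert_mpc` that stands in for a real NMPC solver, and the random-feature regressor (one fixed random tanh layer with a least-squares output layer) that stands in for a trained DNN. Clipping the network output to the input bounds is only one simple way to keep the applied control admissible; it is not claimed to be the paper's feasibility strategy.

```python
import numpy as np

DT, U_MAX, HORIZON = 0.2, 2.0, 5   # illustrative values, not from the paper

def step(x, u):
    """Toy scalar nonlinear plant (hypothetical stand-in for a USV model)."""
    return x + DT * (np.sin(x) + u)

def expert_mpc(x):
    """Crude NMPC 'expert': grid search over one constant input held for the horizon."""
    x = np.atleast_1d(x).astype(float)
    cands = np.linspace(-U_MAX, U_MAX, 401)            # admissible inputs only
    xs = np.repeat(x[:, None], cands.size, axis=1)     # shape (states, candidates)
    cost = np.zeros_like(xs)
    for _ in range(HORIZON):
        xs = step(xs, cands[None, :])
        cost += xs**2 + 0.1 * cands[None, :]**2        # quadratic stage cost
    return cands[np.argmin(cost, axis=1)]

# Offline phase: sample states, query the expert, fit a policy approximator.
rng = np.random.default_rng(0)
x_train = np.linspace(-2.0, 2.0, 400)
u_train = expert_mpc(x_train)

W = rng.normal(scale=1.5, size=30)   # fixed random hidden-layer weights
b = rng.normal(scale=1.5, size=30)   # fixed random hidden-layer biases

def features(x):
    """Random tanh hidden layer plus a constant column."""
    x = np.atleast_1d(x).astype(float)
    hidden = np.tanh(x[:, None] * W[None, :] + b[None, :])
    return np.hstack([hidden, np.ones((x.size, 1))])

# Only the output layer is fitted, by linear least squares.
coef, *_ = np.linalg.lstsq(features(x_train), u_train, rcond=1e-8)

def learned_policy(x):
    """Cheap online policy: one feature evaluation, then clip to the input bounds."""
    return np.clip(features(x) @ coef, -U_MAX, U_MAX)

fit_rmse = float(np.sqrt(np.mean((learned_policy(x_train) - u_train) ** 2)))

# Online phase: closed-loop simulation with the learned policy, no optimization.
x, u_log = 1.5, []
for _ in range(60):
    u = float(learned_policy(x)[0])
    u_log.append(u)
    x = float(step(x, u))
x_final = x

print(f"fit RMSE: {fit_rmse:.4f}, final state: {x_final:.4f}")
```

The design point the sketch tries to convey is that the expensive optimization runs only while the training data are generated; at run time each control step costs a single function evaluation, which is what makes higher sampling rates reachable.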