Paper Title
Deep Residual Reinforcement Learning based Autonomous Blimp Control
Paper Authors
Paper Abstract
Blimps are well suited to perform long-duration aerial tasks as they are energy efficient, relatively silent, and safe. To address the blimp navigation and control task, in previous work we developed a hardware and software-in-the-loop framework and a PID-based controller for large blimps in the presence of wind disturbance. However, blimps have a deformable structure and their dynamics are inherently non-linear and time-delayed, making PID controllers difficult to tune and often resulting in large tracking errors. Moreover, the buoyancy of a blimp is constantly changing due to variations in ambient temperature and pressure. To address these issues, in this paper we present a learning-based framework based on deep residual reinforcement learning (DRRL) for the blimp control task. Within this framework, we first employ a PID controller to provide baseline performance. Subsequently, the DRRL agent learns to modify the PID decisions by interacting with the environment. We demonstrate in simulation that the DRRL agent consistently improves the PID performance. Through rigorous simulation experiments, we show that the agent is robust to changes in wind speed and buoyancy. In real-world experiments, we demonstrate that the agent, trained only in simulation, is sufficiently robust to control an actual blimp in windy conditions. We openly provide the source code of our approach at https://github.com/robot-perception-group/AutonomousBlimpDRL.
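The core idea named in the abstract is residual control: a PID controller supplies a baseline action, and the learned policy adds a correction on top of it. Below is a minimal Python sketch of that scheme, assuming a hypothetical trained `policy` callable and illustrative PID gains; it is not the authors' implementation from the linked repository.

```python
import numpy as np

class PID:
    """Minimal PID controller supplying the baseline action."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def act(self, error):
        # Standard PID law: proportional + integral + derivative terms.
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def residual_control(pid, policy, observation, error):
    """Combine the PID baseline with a learned residual correction."""
    baseline = pid.act(error)         # classical baseline action
    correction = policy(observation)  # DRRL residual, assumed in [-1, 1]
    return float(np.clip(baseline + correction, -1.0, 1.0))

# Usage with a placeholder policy (a trained DRRL agent would replace it):
pid = PID(kp=0.8, ki=0.05, kd=0.2, dt=0.1)
policy = lambda obs: 0.0              # hypothetical trained policy
action = residual_control(pid, policy, observation=np.zeros(4), error=0.3)
```

Because the residual is added to, rather than replacing, the PID output, the combined controller performs no worse than the baseline before training and can only learn corrections on top of it, which matches the abstract's claim that the agent "modifies the PID decisions".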