Paper Title
Learning Minimum-Time Flight in Cluttered Environments
Paper Authors
Paper Abstract
We tackle the problem of minimum-time flight for a quadrotor through a sequence of waypoints in the presence of obstacles while exploiting the full quadrotor dynamics. Early works relied on simplified dynamics or polynomial trajectory representations that did not exploit the full actuator potential of the quadrotor and thus resulted in suboptimal solutions. Recent works can plan minimum-time trajectories; however, the trajectories are executed with control methods that do not account for obstacles. As a result, execution of such trajectories is prone to errors due to model mismatch and in-flight disturbances. To this end, we leverage deep reinforcement learning and classical topological path planning to train robust neural-network controllers for minimum-time quadrotor flight in cluttered environments. The resulting neural-network controller performs up to 19% better than state-of-the-art methods. More importantly, the learned policy solves the planning and control problems simultaneously online to account for disturbances, thus achieving much higher robustness. As such, the presented method achieves a 100% success rate of flying minimum-time policies without collision, while traditional planning and control approaches achieve only 40%. The proposed method is validated in both simulation and the real world, with quadrotor speeds of up to 42 km/h and accelerations of 3.6 g.