Paper title
Continuous Deep Hierarchical Reinforcement Learning for Ground-Air Swarm Shepherding
Paper authors
Abstract
The control and guidance of multi-robot systems (swarms) is a non-trivial problem due to the complexity inherent in the coupled interactions among the group's members. Whether the swarm is cooperative or non-cooperative, lessons can be learnt from sheepdogs herding sheep. Biomimicry of shepherding offers computational methods for swarm control with the potential to generalize and scale across different environments. However, learning to shepherd is complex due to the large search space that a machine learner faces. We present a deep hierarchical reinforcement learning approach for shepherding, whereby an unmanned aerial vehicle (UAV) learns to act as an aerial sheepdog to control and guide a swarm of unmanned ground vehicles (UGVs). The approach extends our previous work on machine education to decompose the search space into a hierarchically organized curriculum. Each lesson in the curriculum is learnt by a deep reinforcement learning model. The hierarchy is formed by fusing the outputs of the models. The approach is demonstrated first in a high-fidelity Robot Operating System (ROS)-based simulation environment, then with physical UGVs and a UAV in an indoor testing facility. We investigate the ability of the method to generalize as the models move from simulation to the real world and from one scale to another.