Reinforcement Learning to Optimize the Logistics Distribution Routes of Unmanned Aerial Vehicle

Paper Title

Reinforcement Learning to Optimize the Logistics Distribution Routes of Unmanned Aerial Vehicle

Author

Feng, Linfei

Abstract

Path planning for unmanned aerial vehicles (UAVs) in goods delivery has drawn great attention from industry and academia, because UAVs are flexible enough to suit many situations in the "last kilometer" between customers and delivery nodes. However, complicated scenarios remain a problem for traditional combinatorial optimization methods. Building on state-of-the-art reinforcement learning (RL), this paper proposes an improved method for UAV path planning in complex surroundings with multiple no-fly zones. The approach leverages the attention mechanism, using an embedding mechanism as the encoder and beam search with three different widths (1, 5, and 10) as the decoders. Policy gradients are used to train the RL model to obtain the optimal strategies during inference. The results show the feasibility and efficiency of applying the model in such complicated situations. Compared with the results obtained by the optimization solver OR-Tools, the model improves the reliability of the distribution system and offers guidance for the broad application of UAVs.
