Learning to Solve Soft-Constrained Vehicle Routing Problems with Lagrangian Relaxation

Paper Title

Learning to Solve Soft-Constrained Vehicle Routing Problems with Lagrangian Relaxation

Authors

Qiaoyue Tang, Yangzhe Kong, Lemeng Pan, Choonmeng Lee

Abstract

Vehicle Routing Problems (VRPs) in real-world applications often come with various constraints, which bring additional computational challenges to exact solution methods and heuristic search approaches. The recent idea of learning heuristic move patterns from sample data has become increasingly promising as a way to reduce solution development costs. However, using learning-based approaches to address more types of constrained VRPs remains a challenge. The difficulty lies in controlling constraint violations while searching for optimal solutions. To overcome this challenge, we propose a Reinforcement Learning (RL) based method that solves soft-constrained VRPs by incorporating the Lagrangian relaxation technique and using constrained policy optimization. We apply the method to three common types of VRPs, the Travelling Salesman Problem with Time Windows (TSPTW), the Capacitated VRP (CVRP), and the Capacitated VRP with Time Windows (CVRPTW), to show the generalizability of the proposed method. In comparisons with existing RL-based methods and open-source heuristic solvers, the method is competitive at finding solutions that balance travel distance, constraint violations, and inference speed.
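The abstract names Lagrangian relaxation but does not give the formulation, so the following is only a minimal sketch of the general idea for a TSPTW-style route, not the authors' method. It assumes unit travel speed, waiting on early arrival, total lateness as the violation measure, and a simple projected dual-ascent update for the multiplier; all function and parameter names (`lagrangian_cost`, `update_multiplier`, `step`, `target`) are hypothetical.

```python
import math


def route_distance(route, coords):
    """Total Euclidean length of a route visiting coords in the given order."""
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(route, route[1:]))


def time_window_violation(route, coords, windows):
    """Total lateness along the route.

    Assumptions (not from the paper): unit speed, start at time 0,
    and the vehicle waits if it arrives before a window opens.
    """
    t, late = 0.0, 0.0
    for a, b in zip(route, route[1:]):
        t += math.dist(coords[a], coords[b])
        open_t, close_t = windows[b]
        t = max(t, open_t)               # wait until the window opens
        late += max(0.0, t - close_t)    # soft constraint: lateness is penalized
    return late


def lagrangian_cost(route, coords, windows, lam):
    """Relaxed objective: travel distance plus lam times the violation."""
    return route_distance(route, coords) + lam * time_window_violation(
        route, coords, windows
    )


def update_multiplier(lam, violation, step=0.1, target=0.0):
    """Projected dual ascent: raise lam when violation exceeds target, keep lam >= 0."""
    return max(0.0, lam + step * (violation - target))
```

With a fixed multiplier, minimizing `lagrangian_cost` trades distance against violations; raising the multiplier whenever violations persist pushes the search toward feasible routes. In an RL setting the negative of this cost could serve as a reward signal, which is again an assumption here rather than something the abstract states.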
