多级预算组合问题的课程学习

论文标题

多级预算组合问题的课程学习

Curriculum learning for multilevel budgeted combinatorial problems

论文作者

Nabli, Adel, Carvalho, Margarida

论文摘要

通过图神经网络学习组合优化问题的启发式方法最近显示了一些经典的NP硬性问题的有希望的结果。这些是只有一个玩家的单层优化问题。多级组合优化问题是它们的概括，涵盖了多个参与者顺序做出决策的情况。通过在多代理增强学习设置中构建它们，我们设计了一种基于价值的方法来学习解决多级预算的组合问题，涉及两个玩家在图中涉及零和零游戏。我们的框架是基于简单的课程：如果代理商知道如何估计预算高达$ b $的实例的价值，那么无论检查每个可能的余后的价值，都可以在多项式时间内完成预算$ b+1 $的实例。因此，在一种自下而上的方法中，我们生成了启发式解决实例的数据集，这些实例越来越大，以训练我们的代理商。我们报告的结果接近最高$ 100 $节点的图表和$ 185 \ times $ $速度的速度与多级临界节点问题已知的最快确切求解器相比，最大的最大最大速率问题是至少$σ_2^p $ - hard。

Learning heuristics for combinatorial optimization problems through graph neural networks have recently shown promising results on some classic NP-hard problems. These are single-level optimization problems with only one player. Multilevel combinatorial optimization problems are their generalization, encompassing situations with multiple players taking decisions sequentially. By framing them in a multi-agent reinforcement learning setting, we devise a value-based method to learn to solve multilevel budgeted combinatorial problems involving two players in a zero-sum game over a graph. Our framework is based on a simple curriculum: if an agent knows how to estimate the value of instances with budgets up to $B$, then solving instances with budget $B+1$ can be done in polynomial time regardless of the direction of the optimization by checking the value of every possible afterstate. Thus, in a bottom-up approach, we generate datasets of heuristically solved instances with increasingly larger budgets to train our agent. We report results close to optimality on graphs up to $100$ nodes and a $185 \times$ speedup on average compared to the quickest exact solver known for the Multilevel Critical Node problem, a max-min-max trilevel problem that has been shown to be at least $Σ_2^p$-hard.

下载PDF全文

下载文献需遵守相关版权规定

论文标题