Decision-Focused Learning without Differentiable Optimization: Learning Locally Optimized Decision Losses

Paper Title

Decision-Focused Learning without Differentiable Optimization: Learning Locally Optimized Decision Losses

Authors

Sanket Shah, Kai Wang, Bryan Wilder, Andrew Perrault, Milind Tambe

Abstract

Decision-Focused Learning (DFL) is a paradigm for tailoring a predictive model to a downstream optimization task that uses its predictions in order to perform better on that specific task. The main technical challenge associated with DFL is that it requires being able to differentiate through the optimization problem, which is difficult due to discontinuous solutions and other challenges. Past work has largely gotten around this issue by handcrafting task-specific surrogates to the original optimization problem that provide informative gradients when differentiated through. However, the need to handcraft surrogates for each new task limits the usability of DFL. In addition, there are often no guarantees about the convexity of the resulting surrogates and, as a result, training a predictive model using them can lead to inferior local optima. In this paper, we do away with surrogates altogether and instead learn loss functions that capture task-specific information. To the best of our knowledge, ours is the first approach that entirely replaces the optimization component of decision-focused learning with a loss that is automatically learned. Our approach (a) only requires access to a black-box oracle that can solve the optimization problem and is thus generalizable, and (b) can be convex by construction and so can be easily optimized over. We evaluate our approach on three resource allocation problems from the literature and find that our approach outperforms learning without taking into account task structure in all three domains, and even hand-crafted surrogates from the literature.
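
To make the idea of a learned, convex-by-construction loss concrete, here is a minimal sketch. It is not the authors' implementation: the toy "pick the best item" task, the weighted-MSE loss family, the projected gradient fit, and every function name below are assumptions chosen for illustration. The only ingredients taken from the abstract are a black-box oracle that solves the optimization problem and a loss that is fitted to the decision quality observed for predictions sampled near the true values.

```python
import random

def oracle(y_hat):
    """Black-box solver for a hypothetical toy task: pick the item with the highest predicted value."""
    return max(range(len(y_hat)), key=lambda i: y_hat[i])

def decision_loss(y_hat, y_true):
    """Regret incurred by acting on y_hat when the true values are y_true."""
    return max(y_true) - y_true[oracle(y_hat)]

def sample_dataset(y_true, n_samples, noise, rng):
    """Sample predictions around y_true and label each one with its decision loss."""
    data = []
    for _ in range(n_samples):
        y_hat = [t + rng.gauss(0.0, noise) for t in y_true]
        feats = [(p - t) ** 2 for p, t in zip(y_hat, y_true)]
        data.append((feats, decision_loss(y_hat, y_true)))
    return data

def fit_weighted_mse(data, dim, lr=0.05, epochs=200):
    """Fit weights w >= 0 so that sum_i w_i * (y_hat_i - y_i)^2 approximates the decision loss.

    Projected gradient descent clips the weights at zero after every step,
    which keeps the fitted loss convex in y_hat (an assumed loss family).
    """
    w = [0.0] * dim
    n = len(data)
    for _ in range(epochs):
        grad = [0.0] * dim
        for feats, target in data:
            err = sum(wi * f for wi, f in zip(w, feats)) - target
            for i in range(dim):
                grad[i] += 2.0 * err * feats[i] / n
        w = [max(0.0, wi - lr * g) for wi, g in zip(w, grad)]
    return w

def learned_loss(w, y_hat, y_true):
    """Evaluate the fitted surrogate loss; it needs no call to the oracle."""
    return sum(wi * (p - t) ** 2 for wi, p, t in zip(w, y_hat, y_true))

# Usage: fit a local loss for one training instance with three items.
rng = random.Random(0)
y_true = [1.0, 2.0, 3.0]
data = sample_dataset(y_true, 200, 1.0, rng)
w = fit_weighted_mse(data, dim=3)
```

Once fitted, `learned_loss` could stand in for the decision loss when training a predictive model by gradient descent, because it is differentiable in `y_hat` and needs no solver call. The oracle is only queried while building the sampled dataset.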
