Learning the Pareto Front with Hypernetworks

Paper title

Learning the Pareto Front with Hypernetworks

Authors

Aviv Navon, Aviv Shamsian, Gal Chechik, Ethan Fetaya

Abstract

Multi-objective optimization (MOO) problems are prevalent in machine learning. These problems have a set of optimal solutions, called the Pareto front, where each point on the front represents a different trade-off between possibly conflicting objectives. Recent MOO methods can target a specific desired ray in loss space; however, most approaches still face two grave limitations: (i) a separate model has to be trained for each point on the front; and (ii) the exact trade-off must be known before the optimization process. Here, we tackle the problem of learning the entire Pareto front, with the capability of selecting a desired operating point on the front after training. We call this new setup Pareto-Front Learning (PFL). We describe an approach to PFL implemented using HyperNetworks, which we term Pareto HyperNetworks (PHNs). A PHN learns the entire Pareto front simultaneously using a single hypernetwork, which receives as input a desired preference vector and returns a Pareto-optimal model whose loss vector is in the desired ray. The unified model is runtime efficient compared to training multiple models and generalizes to new operating points not used during training. We evaluate our method on a wide set of problems, from multi-task regression and classification to fairness. PHNs learn the entire Pareto front in roughly the same time as it takes to learn a single point on the front, while also reaching a better solution set. Furthermore, we show that PHNs can scale to generate large models like ResNet18. PFL opens the door to new applications where models are selected based on preferences that are only available at run time.
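
The core idea in the abstract, a single hypernetwork that maps a preference vector to model parameters, can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a linear hypernetwork that outputs one scalar parameter, two hand-picked quadratic objectives, and plain linear scalarization (a preference-weighted sum of losses) as the training loss. Linear scalarization is a simplification and does not by itself guarantee that the resulting loss vector lies on the requested ray. The names `hypernet`, `losses`, and `train` are hypothetical.

```python
import random


def hypernet(phi, r):
    """Toy hypernetwork: maps a preference vector r to one model parameter theta."""
    w1, w2, b = phi
    return w1 * r[0] + w2 * r[1] + b


def losses(theta):
    """Two conflicting toy objectives: l1 prefers theta = 0, l2 prefers theta = 1."""
    return theta ** 2, (theta - 1.0) ** 2


def train(steps=5000, lr=0.1, seed=0):
    """Train the hypernetwork with SGD on a preference-weighted sum of losses."""
    rng = random.Random(seed)
    phi = [0.0, 0.0, 0.0]  # hypernetwork parameters (w1, w2, b)
    for _ in range(steps):
        # Sample a fresh preference vector on the 2-simplex at every step.
        r1 = rng.random()
        r = (r1, 1.0 - r1)
        theta = hypernet(phi, r)
        # Gradient of r1 * l1 + r2 * l2 with respect to theta.
        g = 2.0 * r[0] * theta + 2.0 * r[1] * (theta - 1.0)
        # Chain rule through the linear hypernetwork.
        phi[0] -= lr * g * r[0]
        phi[1] -= lr * g * r[1]
        phi[2] -= lr * g
    return phi


phi = train()
# After training, pick any operating point by changing only the preference vector.
for r in [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9)]:
    theta = hypernet(phi, r)
    print(r, round(theta, 3), [round(l, 3) for l in losses(theta)])
```

In this toy setting the scalarized optimum for a preference (r1, r2) with r1 + r2 = 1 is theta = r2, so one trained hypernetwork traces out the whole trade-off curve, and preferences never sampled during training can still be queried afterwards.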
