Paper Title
FlexiBO: A Decoupled Cost-Aware Multi-Objective Optimization Approach for Deep Neural Networks
Paper Authors
Paper Abstract
The design of machine learning systems often requires trading off different objectives, for example, the prediction error and energy consumption of deep neural networks (DNNs). Typically, no single design performs well on all objectives; therefore, finding Pareto-optimal designs is of interest. The search for Pareto-optimal designs evaluates designs in an iterative process, and the measurements are used to compute an acquisition function that guides the search. However, measuring different objectives incurs different costs. For example, measuring the prediction error of a DNN is orders of magnitude more expensive than measuring the energy consumption of a pre-trained DNN, because the former requires re-training the DNN. Current state-of-the-art methods do not account for this difference in objective evaluation cost and may therefore perform expensive evaluations of objective functions during optimization. In this paper, we develop a novel decoupled, cost-aware multi-objective optimization algorithm, which we call Flexible Multi-Objective Bayesian Optimization (FlexiBO), to address this issue. FlexiBO weights the hypervolume improvement of the Pareto region by the measurement cost of each objective, balancing the expense of collecting new information against the knowledge gained from objective evaluations and thereby avoiding expensive measurements that yield little to no gain. We evaluate FlexiBO on seven state-of-the-art DNNs for image recognition, natural language processing (NLP), and speech-to-text translation. Our results indicate that, given the same total experimental budget, FlexiBO discovers designs with 4.8% to 12.4% lower hypervolume error than the best method in state-of-the-art multi-objective optimization.
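To make the cost-weighted, decoupled acquisition rule described above concrete, the sketch below scores each candidate (design, objective) pair by an estimated hypervolume gain divided by that objective's measurement cost, and selects the best-scoring pair for the next evaluation. This is a minimal illustrative sketch under stated assumptions, not the paper's implementation: `gain_fn`, the toy designs, and the cost values are hypothetical placeholders standing in for FlexiBO's hypervolume-improvement estimate and real measurement costs.

```python
import numpy as np

def select_next_evaluation(candidates, costs, gain_fn):
    """Decoupled, cost-aware selection sketch: return the
    (candidate_index, objective_index) pair that maximizes the
    estimated hypervolume gain per unit of measurement cost."""
    best_pair, best_score = None, -np.inf
    for j, x in enumerate(candidates):
        for i, cost in enumerate(costs):
            # Cost-weighted improvement: cheap objectives (e.g., energy
            # of a pre-trained DNN) are favored over expensive ones
            # (e.g., prediction error, which requires re-training)
            # unless the expected gain justifies the expense.
            score = gain_fn(x, i) / cost
            if score > best_score:
                best_pair, best_score = (j, i), score
    return best_pair

# Toy usage with two objectives of very different measurement costs.
rng = np.random.default_rng(0)
designs = rng.uniform(size=(5, 3))            # 5 hypothetical DNN designs
costs = [100.0, 1.0]                          # error ~100x costlier than energy
toy_gain = lambda x, i: float(x.sum()) * (i + 1)  # stand-in for hypervolume gain
print(select_next_evaluation(designs, costs, toy_gain))
```

In a full Bayesian optimization loop, the selected pair would be measured, the corresponding surrogate model updated, and the Pareto region recomputed before the next selection; only the selection step is sketched here.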