Paper Title
A Mixed-Integer Programming Approach to Training Dense Neural Networks
Paper Authors
Paper Abstract
Artificial Neural Networks (ANNs) are prevalent machine learning models that are applied across various real-world classification tasks. However, training ANNs is time-consuming and the resulting models require substantial memory to deploy. In order to train more parsimonious ANNs, we propose novel mixed-integer programming (MIP) formulations for training fully-connected ANNs. Our formulations can account for both binary and rectified linear unit (ReLU) activations, and for the use of a log-likelihood loss. We present numerical experiments comparing our MIP-based methods against existing approaches and show that we are able to achieve competitive out-of-sample performance with more parsimonious models.
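For context, a standard way to embed a ReLU neuron inside a MIP is the big-M linearization with one binary indicator per neuron; the sketch below is illustrative only and not necessarily the formulation proposed in the paper. Here $a$ denotes the neuron's input vector, $w$ and $b$ its weights and bias, $h$ its output, $z \in \{0,1\}$ an indicator of whether the neuron is active, and $M$ an assumed valid bound satisfying $|w^\top a + b| \le M$:

\[
\begin{aligned}
h &= \max\!\left(0,\; w^\top a + b\right) \quad \text{is linearized as} \\
h &\ge w^\top a + b, \\
h &\le w^\top a + b + M(1 - z), \\
0 &\le h \le M z, \qquad z \in \{0, 1\}.
\end{aligned}
\]

When $z = 1$ the constraints force $h = w^\top a + b \ge 0$, and when $z = 0$ they force $h = 0$ with $w^\top a + b \le 0$, which together recover the ReLU behavior exactly whenever the bound $M$ is valid.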