Paper Title
A Mixed-Integer Programming Approach to Training Dense Neural Networks
Paper Authors
Paper Abstract
Artificial Neural Networks (ANNs) are prevalent machine learning models that are applied across various real-world classification tasks. However, training ANNs is time-consuming and the resulting models require substantial memory to deploy. In order to train more parsimonious ANNs, we propose novel mixed-integer programming (MIP) formulations for training fully-connected ANNs. Our formulations can account for both binary and rectified linear unit (ReLU) activations, and for the use of a log-likelihood loss. We present numerical experiments comparing our MIP-based methods against existing approaches and show that we are able to achieve competitive out-of-sample performance with more parsimonious models.
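For context, a standard way to embed a ReLU neuron inside a MIP is the big-M linearization with one binary indicator per neuron; the sketch below is illustrative only and not necessarily the formulation proposed in the paper. Here $a$ denotes the neuron's input vector, $w$ and $b$ its weights and bias, $h$ its output, $z \in \{0,1\}$ an indicator of whether the neuron is active, and $M$ an assumed valid bound satisfying $|w^\top a + b| \le M$:

\[
\begin{aligned}
h &= \max\!\left(0,\; w^\top a + b\right) \quad \text{is linearized as} \\
h &\ge w^\top a + b, \\
h &\le w^\top a + b + M(1 - z), \\
0 &\le h \le M z, \qquad z \in \{0, 1\}.
\end{aligned}
\]

When $z = 1$ the constraints force $h = w^\top a + b \ge 0$, and when $z = 0$ they force $h = 0$ with $w^\top a + b \le 0$, which together recover the ReLU behavior exactly whenever the bound $M$ is valid.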