Paper Title

ReLU activated Multi-Layer Neural Networks trained with Mixed Integer Linear Programs

Paper Authors

Goebbels, Steffen

Paper Abstract

In this paper, a case study demonstrates that multilayer feedforward neural networks activated by ReLU functions can in principle be trained iteratively with Mixed Integer Linear Programs (MILPs) as follows. Weights are determined by batch learning, with multiple iterations per batch of training data. In each iteration, the algorithm starts at the output layer and propagates information back to the first hidden layer, adjusting the weights with MILPs or Linear Programs. For each layer, the goal is to minimize the difference between its output and the corresponding target output. The target output of the last (output) layer equals the ground truth; the target output of a preceding layer is defined as the adjusted input of the following layer. For a given layer, the weights are computed by solving a MILP. Then, except for the first hidden layer, the input values are also modified with a MILP to better match the layer outputs to their corresponding target outputs. The method was tested and compared against TensorFlow/Keras (Adam optimizer) using two simple networks on the MNIST dataset of handwritten digits. Accuracies of the same magnitude as with TensorFlow/Keras were achieved.
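The abstract only outlines the per-layer weight step, so the following is a minimal sketch of one plausible formulation, not the author's reference implementation: for a single ReLU layer and a small batch, the weights and biases are decision variables, the ReLU is linearized with binary variables and big-M constraints, and the summed absolute deviation from the target outputs is minimized. The choice of solver interface (PuLP with CBC), the L1 objective, the variable bounds, and the big-M value are all assumptions made for illustration.

```python
# Sketch of a per-layer weight-fitting MILP (assumed formulation, not the paper's code).
import pulp


def fit_relu_layer(inputs, targets, big_m=100.0, bound=10.0):
    """inputs: list of input vectors; targets: list of target output vectors (one pair per sample)."""
    n_in, n_out = len(inputs[0]), len(targets[0])
    prob = pulp.LpProblem("relu_layer_fit", pulp.LpMinimize)

    # Layer weights and biases are the decision variables (bounds are an assumption).
    w = [[pulp.LpVariable(f"w_{i}_{j}", -bound, bound) for j in range(n_in)]
         for i in range(n_out)]
    b = [pulp.LpVariable(f"b_{i}", -bound, bound) for i in range(n_out)]

    deviations = []
    for k, (x, t) in enumerate(zip(inputs, targets)):
        for i in range(n_out):
            pre = pulp.lpSum(w[i][j] * x[j] for j in range(n_in)) + b[i]  # pre-activation
            out = pulp.LpVariable(f"o_{k}_{i}", lowBound=0)               # ReLU output
            z = pulp.LpVariable(f"z_{k}_{i}", cat="Binary")               # 1 if ReLU is active

            # Big-M linearization of out = max(0, pre):
            # z = 1 forces out = pre, z = 0 forces out = 0 (and pre <= 0).
            prob += out >= pre
            prob += out <= pre + big_m * (1 - z)
            prob += out <= big_m * z

            # Absolute deviation |out - t_i| via an auxiliary variable.
            d = pulp.LpVariable(f"d_{k}_{i}", lowBound=0)
            prob += d >= out - t[i]
            prob += d >= t[i] - out
            deviations.append(d)

    prob += pulp.lpSum(deviations)            # minimize total deviation from the targets
    prob.solve(pulp.PULP_CBC_CMD(msg=False))  # CBC solver bundled with PuLP

    weights = [[pulp.value(w[i][j]) for j in range(n_in)] for i in range(n_out)]
    biases = [pulp.value(b[i]) for i in range(n_out)]
    return weights, biases
```

The input-adjustment step described in the abstract (applied to all layers except the first hidden layer) could be set up analogously, with the layer's inputs as decision variables and its weights held fixed; big-M linearization of the ReLU is only valid if pre-activations stay within the chosen big-M bound.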
