Paper Title

Generalisation and the Risk--Entropy Curve

Paper Authors

Dominic Belcher, Antonia Marcu, Adam Prügel-Bennett

Paper Abstract

In this paper we show that the expected generalisation performance of a learning machine is determined by the distribution of risks or equivalently its logarithm -- a quantity we term the risk entropy -- and the fluctuations in a quantity we call the training ratio. We show that the risk entropy can be empirically inferred for deep neural network models using Markov Chain Monte Carlo techniques. Results are presented for different deep neural networks on a variety of problems. The asymptotic behaviour of the risk entropy acts in an analogous way to the capacity of the learning machine, but the generalisation performance experienced in practical situations is determined by the behaviour of the risk entropy before the asymptotic regime is reached. This performance is strongly dependent on the distribution of the data (features and targets) and not just on the capacity of the learning machine.
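The abstract states that the risk entropy (the log of the distribution of risks over parameter space) can be inferred empirically with Markov Chain Monte Carlo. Below is a minimal, self-contained sketch of that general idea, not the authors' implementation: it samples the weights of a toy linear model with random-walk Metropolis targeting a tempered density exp(-beta * risk), records the empirical risk of each state, and takes the log of the resulting risk histogram as a crude risk-entropy estimate. The model, data, temperature beta, and the names `risk` and `metropolis_risks` are all illustrative assumptions.

```python
# Illustrative sketch only: estimate a log risk distribution ("risk
# entropy") for a toy model by MCMC sampling over its parameters.
import numpy as np

rng = np.random.default_rng(0)

# Toy data set: noisy linear targets in 2 features (an assumption,
# standing in for the paper's deep-network benchmarks).
X = rng.normal(size=(200, 2))
y = X @ np.array([1.5, -0.7]) + 0.1 * rng.normal(size=200)

def risk(w):
    """Empirical risk: mean squared error of a linear model with weights w."""
    return np.mean((X @ w - y) ** 2)

def metropolis_risks(n_steps=20000, step=0.1, beta=5.0):
    """Random-walk Metropolis over weights, targeting exp(-beta * risk(w)).

    Returns the risk observed at each step, i.e. samples from the
    (tempered) distribution of risks induced by the parameter prior.
    """
    w = rng.normal(size=2)
    r = risk(w)
    risks = np.empty(n_steps)
    for t in range(n_steps):
        w_prop = w + step * rng.normal(size=2)  # symmetric proposal
        r_prop = risk(w_prop)
        # Standard Metropolis accept/reject in log space.
        if np.log(rng.uniform()) < -beta * (r_prop - r):
            w, r = w_prop, r_prop
        risks[t] = r
    return risks

risks = metropolis_risks()
# Histogram of sampled risks; its log is a crude empirical estimate of
# the log risk distribution, the quantity the paper terms risk entropy.
counts, edges = np.histogram(risks, bins=50, density=True)
centres = 0.5 * (edges[:-1] + edges[1:])
log_density = np.log(np.where(counts > 0, counts, np.nan))
for c, s in zip(centres[:5], log_density[:5]):
    print(f"risk≈{c:.3f}  log-density≈{s:.2f}")
```

Plotting `log_density` against `centres` gives a rough risk-entropy curve for this toy model; the paper's contribution is doing this for deep networks, where the asymptotic behaviour of the curve plays a role analogous to capacity.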
