Title

Small Data, Big Decisions: Model Selection in the Small-Data Regime

Authors

Jörg Bornschein, Francesco Visin, Simon Osindero

Abstract

Highly overparametrized neural networks can display curiously strong generalization performance - a phenomenon that has recently garnered a wealth of theoretical and empirical research aimed at better understanding it. In contrast to most previous work, which typically considers performance as a function of model size, in this paper we empirically study generalization performance as the size of the training set varies over multiple orders of magnitude. These systematic experiments lead to some interesting and potentially very useful observations; perhaps most notably, that training on smaller subsets of the data can lead to more reliable model selection decisions whilst simultaneously enjoying smaller computational costs. Our experiments furthermore allow us to estimate Minimum Description Lengths for common datasets given modern neural network architectures, thereby paving the way for principled model selection that takes Occam's razor into account.
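The abstract's central observation - that ranking candidate models on a small training subset can already yield a reliable model selection decision - can be illustrated with a minimal sketch. This is not the paper's experimental setup: polynomial regression stands in for neural architectures, and the subset size, candidate degrees, and toy data are all hypothetical choices made here for illustration.

```python
import numpy as np

def val_loss(degree, x_tr, y_tr, x_va, y_va):
    # Fit a polynomial of the given degree on the training subset
    # and return the mean squared error on the held-out points.
    coeffs = np.polyfit(x_tr, y_tr, degree)
    pred = np.polyval(coeffs, x_va)
    return float(np.mean((pred - y_va) ** 2))

def select_model(degrees, x, y, n_subset, rng):
    # Train each candidate model on a small random subset only,
    # then rank the candidates by validation loss.
    idx = rng.permutation(len(x))
    tr, va = idx[:n_subset], idx[n_subset:]
    losses = {d: val_loss(d, x[tr], y[tr], x[va], y[va]) for d in degrees}
    return min(losses, key=losses.get), losses

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 500)
y = 2.0 * x + 0.1 * rng.normal(size=500)  # ground truth is linear

# Only 30 of the 500 points are used for training each candidate.
best, losses = select_model([1, 3, 9], x, y, n_subset=30, rng=rng)
print("selected degree:", best)
```

The key point mirrored here is the cost asymmetry: each candidate is fit on 30 points rather than the full 500, yet the validation ranking can still separate well-matched from overparametrized candidates.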
