Paper Title
Automatic Neural Network Hyperparameter Optimization for Extrapolation: Lessons Learned from Visible and Near-Infrared Spectroscopy of Mango Fruit
Paper Authors
Paper Abstract
Neural networks are configured by choosing an architecture and hyperparameter values; doing so often involves expert intuition and hand-tuning to find a configuration that extrapolates well without overfitting. This paper considers automatic methods for configuring a neural network that extrapolates in time for the domain of visible and near-infrared (VNIR) spectroscopy. In particular, we study the effect of (a) selecting samples for validating configurations and (b) using ensembles. Most of the time, models are built from the past to predict the future. To encourage the neural network model to extrapolate, we consider validating model configurations on samples that are shifted in time similarly to the test set. We experiment with three validation set choices: (1) a random sample of 1/3 of non-test data (the technique used in previous work), (2) using the latest 1/3 (sorted by time), and (3) using a semantically meaningful subset of the data. Hyperparameter optimization relies on the validation set to estimate test-set error, but neural network variance obfuscates the true error value. Ensemble averaging - computing the average across many neural networks - can reduce the variance of prediction errors. To test these methods, we do a comprehensive study of a held-out 2018 harvest season of mango fruit, given VNIR spectra from three prior years. We find that ensembling improves the state-of-the-art model's variance and accuracy. Furthermore, hyperparameter optimization experiments - with and without ensemble averaging, and with each validation set choice - show that when ensembling is combined with using the latest 1/3 of samples as the validation set, a neural network configuration is found automatically that is on par with the state-of-the-art.
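The two ingredients highlighted in the abstract, a time-sorted validation split and ensemble averaging during hyperparameter selection, can be illustrated with a minimal sketch. The code below is not the paper's implementation: it assumes spectra X, targets y (e.g., dry matter), and sortable harvest dates for the non-test seasons are available as NumPy arrays, and it uses scikit-learn's MLPRegressor as a stand-in for the paper's networks, with a fixed candidate list in place of an automatic hyperparameter optimizer.

# Minimal sketch (not the paper's code): validation choice (2), the latest 1/3 of
# non-test samples sorted by time, combined with ensemble averaging to score
# hyperparameter configurations. X, y, dates, and the candidate configs are
# assumed/hypothetical inputs.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

def time_sorted_split(X, y, dates, val_fraction=1/3):
    """Hold out the latest `val_fraction` of samples (sorted by time) for validation."""
    order = np.argsort(dates)
    cut = int(len(order) * (1 - val_fraction))
    train_idx, val_idx = order[:cut], order[cut:]
    return X[train_idx], y[train_idx], X[val_idx], y[val_idx]

def ensemble_rmse(config, X_tr, y_tr, X_val, y_val, n_members=10, seed=0):
    """Train several networks with the same config and average their predictions."""
    preds = []
    for m in range(n_members):
        net = MLPRegressor(random_state=seed + m, **config)
        net.fit(X_tr, y_tr)
        preds.append(net.predict(X_val))
    avg_pred = np.mean(preds, axis=0)  # ensemble averaging reduces prediction variance
    return np.sqrt(mean_squared_error(y_val, avg_pred))

def select_config(candidate_configs, X, y, dates):
    """Pick the config with the lowest ensemble-averaged error on the time-shifted split."""
    X_tr, y_tr, X_val, y_val = time_sorted_split(X, y, dates)
    scored = [(ensemble_rmse(cfg, X_tr, y_tr, X_val, y_val), cfg)
              for cfg in candidate_configs]
    return min(scored, key=lambda s: s[0])

# Example usage (hypothetical configurations, not the paper's search space):
# candidates = [{"hidden_layer_sizes": (64,), "alpha": 1e-3, "max_iter": 500},
#               {"hidden_layer_sizes": (128, 64), "alpha": 1e-4, "max_iter": 500}]
# best_rmse, best_config = select_config(candidates, X, y, dates)

The sketch only shows validation choice (2); choices (1) and (3) in the abstract would swap time_sorted_split for a random 1/3 split or a semantically chosen subset, and the paper generates candidate configurations with an automatic hyperparameter optimizer rather than a fixed list.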