Paper Title
The Environmental Discontinuity Hypothesis for Down-Sampled Lexicase Selection
Authors
Abstract
Down-sampling training data has long been shown to improve the generalization performance of a wide range of machine learning systems. Recently, down-sampling has proved effective in genetic programming (GP) runs that use the lexicase parent selection technique. Although this down-sampling procedure has been shown to significantly improve performance across a variety of problems, the improvement does not seem to come from encouraging adaptability through environmental change. We hypothesize that the random sampling performed every generation causes discontinuities that leave the population unable to adapt to the shifting environment. We investigate modifications to down-sampled lexicase selection that aim to promote incremental environmental change, scaffolding evolution by reducing the number of jarring discontinuities between the environments of successive generations. In our empirical studies, we find that forcing incremental environmental change is not significantly better for evolving solutions to program synthesis problems than simple random down-sampling. In response, we attempt to exacerbate the hypothesized discontinuities by using only disjoint down-samples, to see whether doing so hinders performance. We find that this also does not differ significantly from the performance of regular random down-sampling. These negative results raise new questions about how the composition of sub-samples, which may include synonymous cases, may be expected to influence the performance of machine learning systems that use down-sampling.
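
The three sampling regimes contrasted in the abstract (random, incremental, and disjoint down-samples) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `n_swap` parameter, and the exact way the incremental sample is built are assumptions made for illustration, since the abstract does not specify these details. The lexicase step follows the standard formulation, in which candidates are filtered case by case in a shuffled order.

```python
import random


def random_downsample(n_cases, k, rng):
    """Plain down-sampling: draw k distinct case indices uniformly at random."""
    return rng.sample(range(n_cases), k)


def incremental_downsample(prev, n_cases, k, n_swap, rng):
    """Assumed incremental scheme: keep k - n_swap cases from the previous
    sample and replace the rest with cases that were not in it."""
    prev_set = set(prev)
    keep = rng.sample(prev, k - n_swap)
    pool = [c for c in range(n_cases) if c not in prev_set]
    return keep + rng.sample(pool, n_swap)


def disjoint_downsample(prev, n_cases, k, rng):
    """Assumed disjoint scheme: the new sample shares no case with the previous one."""
    prev_set = set(prev)
    pool = [c for c in range(n_cases) if c not in prev_set]
    return rng.sample(pool, k)


def lexicase_select(errors, cases, rng):
    """Select one parent index. errors[i][c] is the error of individual i on case c.
    Only the cases in the current down-sample are considered."""
    candidates = list(range(len(errors)))
    order = list(cases)
    rng.shuffle(order)
    for c in order:
        best = min(errors[i][c] for i in candidates)
        candidates = [i for i in candidates if errors[i][c] == best]
        if len(candidates) == 1:
            break
    return rng.choice(candidates)


if __name__ == "__main__":
    rng = random.Random(0)
    sample = random_downsample(n_cases=20, k=5, rng=rng)
    next_incremental = incremental_downsample(sample, 20, 5, n_swap=1, rng=rng)
    next_disjoint = disjoint_downsample(sample, 20, 5, rng=rng)
    print(sample, next_incremental, next_disjoint)
```

In this sketch the "environment" of a generation is simply the list of case indices passed to `lexicase_select`; the overlap between successive samples is what the incremental and disjoint variants push toward its two extremes.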