Paper Title
Combined Pruning for Nested Cross-Validation to Accelerate Automated Hyperparameter Optimization for Embedded Feature Selection in High-Dimensional Data with Very Small Sample Sizes
Paper Authors
Paper Abstract
Background: Embedded feature selection in high-dimensional data with very small sample sizes requires optimized hyperparameters for the model building process. For this hyperparameter optimization, nested cross-validation must be applied to avoid a biased performance estimation. The resulting repeated training with high-dimensional data leads to very long computation times. Moreover, a high variance in the individual performance evaluation metrics is likely, caused by outliers in the tiny validation sets. Therefore, early stopping with standard pruning algorithms to save time risks discarding promising hyperparameter sets. Results: To speed up feature selection for high-dimensional data with very small sample sizes, we adapt the use of a state-of-the-art asynchronous successive halving pruner. In addition, we combine it with two complementary pruning strategies based on domain or prior knowledge. One pruning strategy immediately stops computing trials whose results are semantically meaningless for the selected hyperparameter combinations. The other is a new extrapolating threshold pruning strategy suited to nested cross-validation with a high variance of performance evaluation metrics. In repeated experiments, our combined pruning strategy keeps all promising trials while substantially reducing the computation time compared to using the state-of-the-art asynchronous successive halving pruner alone: up to 81.3% fewer models were trained while achieving the same optimization result. Conclusion: The proposed combined pruning strategy accelerates data analysis or enables deeper hyperparameter searches within the same computation time. This leads to significant savings in time, money, and energy consumption, opening the door to advanced, time-consuming analyses.
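The two knowledge-based pruning rules described in the abstract can be sketched in a few lines. The following is a minimal, library-free illustration (not the authors' implementation): it assumes an AUC-like metric where values at or below chance level (0.5) are "semantically meaningless", and it reads the extrapolating threshold rule as an optimistic projection over the remaining cross-validation folds. The constants `CHANCE_LEVEL`, `TARGET`, and `N_FOLDS` are hypothetical placeholders.

```python
from statistics import mean

CHANCE_LEVEL = 0.5  # assumption: AUC at or below chance is semantically meaningless
TARGET = 0.8        # hypothetical threshold a trial must remain able to reach
N_FOLDS = 10        # assumed number of outer cross-validation folds


def should_prune(fold_metrics, n_folds=N_FOLDS, chance=CHANCE_LEVEL, target=TARGET):
    """Return True if a trial can be stopped early.

    fold_metrics: validation metrics (e.g. AUC) of the outer-CV folds
    evaluated so far. Two rules, as sketched from the abstract:
    1. prune trials whose results are semantically meaningless
       (here: no fold so far beats chance level);
    2. extrapolating threshold: optimistically assume every remaining
       fold matches the best fold seen so far; if even this optimistic
       mean cannot reach the target, the trial cannot become promising.
    """
    if not fold_metrics:
        return False  # nothing evaluated yet, cannot judge the trial
    # Rule 1: semantically meaningless result -> stop immediately
    if max(fold_metrics) <= chance:
        return True
    # Rule 2: optimistic extrapolation over the remaining folds
    remaining = n_folds - len(fold_metrics)
    optimistic = fold_metrics + [max(fold_metrics)] * remaining
    return mean(optimistic) < target
```

Note how rule 2 tolerates a single low outlier fold: one poor value barely lowers the optimistic mean, so a trial that is otherwise promising is kept, which matches the abstract's claim that the combined strategy retains all promising trials despite high per-fold variance.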