Paper title
Consensual Aggregation on Random Projected High-dimensional Features for Regression
Authors
Abstract
In this paper, we study kernel-based consensual aggregation on randomly projected high-dimensional features of predictions for regression. The aggregation scheme has two steps. First, the high-dimensional features of predictions, given by a large number of regression estimators, are randomly projected into a smaller subspace using the Johnson-Lindenstrauss Lemma. Second, a kernel-based consensual aggregation is implemented on the projected features. We show theoretically that, with high probability, the performance of the aggregation scheme is close to that of the aggregation implemented on the original high-dimensional features. Moreover, we illustrate numerically that the aggregation scheme maintains its performance on very large and highly correlated features of predictions given by different types of machines. The scheme allows us to flexibly merge a large number of redundant machines that are plainly constructed, without model selection or cross-validation. The efficiency of the proposed method is illustrated through several experiments on different types of synthetic and real datasets.
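The two-step scheme described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the projection matrix, the kernel, or the bandwidth, so the sketch assumes a Gaussian random matrix scaled by 1/sqrt(m), a Gaussian kernel, and an arbitrary fixed bandwidth `h`. The function names `random_projection` and `kernel_aggregate` are hypothetical, and the "machines" are simulated as noisy copies of the regression function rather than trained estimators.

```python
import numpy as np

def random_projection(Z, m, rng):
    """Project the rows of Z (n x M) into m dimensions.

    Assumption: a Gaussian matrix scaled by 1/sqrt(m), one common
    construction associated with the Johnson-Lindenstrauss Lemma.
    """
    M = Z.shape[1]
    G = rng.normal(size=(M, m)) / np.sqrt(m)
    return Z @ G, G

def kernel_aggregate(P_train, y_train, P_query, h):
    """Kernel-weighted average of y_train for each query point.

    Weights depend on distances between projected prediction vectors.
    Assumption: Gaussian kernel with bandwidth h.
    """
    # Squared distances, shape (n_query, n_train).
    d2 = ((P_query[:, None, :] - P_train[None, :, :]) ** 2).sum(axis=2)
    # Shifting each row by its minimum leaves the normalized weights
    # unchanged and keeps at least one weight equal to 1.
    d2 -= d2.min(axis=1, keepdims=True)
    W = np.exp(-d2 / (2.0 * h ** 2))
    return (W @ y_train) / W.sum(axis=1)

rng = np.random.default_rng(0)
n_agg, n_new, M, m = 200, 50, 500, 20

# Hypothetical regression values and noisy responses.
f_agg = rng.uniform(-2, 2, size=n_agg)
f_new = rng.uniform(-2, 2, size=n_new)
y_agg = f_agg + 0.1 * rng.normal(size=n_agg)

# Simulated redundant machines: truth plus independent noise per machine.
Z_agg = f_agg[:, None] + 0.3 * rng.normal(size=(n_agg, M))
Z_new = f_new[:, None] + 0.3 * rng.normal(size=(n_new, M))

# Step 1: project the M-dimensional prediction features down to m.
P_agg, G = random_projection(Z_agg, m, rng)
P_new = Z_new @ G  # the same matrix must be reused for new points

# Step 2: kernel-based aggregation on the projected features.
y_hat = kernel_aggregate(P_agg, y_agg, P_new, h=5.0)
print(y_hat[:5])
```

In practice the bandwidth and the projected dimension `m` would need to be tuned, for example on held-out data; the values above are placeholders chosen only to make the sketch run.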