Paper Title

RaSE: Random Subspace Ensemble Classification

Authors

Ye Tian, Yang Feng

Abstract

We propose a flexible ensemble classification framework, Random Subspace Ensemble (RaSE), for sparse classification. In the RaSE algorithm, we aggregate many weak learners, where each weak learner is a base classifier trained in a subspace optimally selected from a collection of random subspaces. To conduct subspace selection, we propose a new criterion, ratio information criterion (RIC), based on weighted Kullback-Leibler divergence. The theoretical analysis includes the risk and Monte-Carlo variance of the RaSE classifier, establishing the screening consistency and weak consistency of RIC, and providing an upper bound for the misclassification rate of the RaSE classifier. In addition, we show that in a high-dimensional framework, the number of random subspaces needs to be very large to guarantee that a subspace covering signals is selected. Therefore, we propose an iterative version of the RaSE algorithm and prove that under some specific conditions, a smaller number of generated random subspaces are needed to find a desirable subspace through iteration. An array of simulations under various models and real-data applications demonstrate the effectiveness and robustness of the RaSE classifier and its iterative version in terms of low misclassification rate and accurate feature ranking. The RaSE algorithm is implemented in the R package RaSEn on CRAN.
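
The aggregation scheme described in the abstract can be illustrated with a small sketch. It is not the authors' implementation and does not reproduce the RaSEn package. It makes several simplifying assumptions: the base classifier is a nearest-centroid rule, training error stands in for the ratio information criterion (RIC), the ensemble votes with a fixed 0.5 threshold, and feature ranking is taken as the fraction of selected subspaces containing each feature. All function names and parameters (`rase_sketch`, `n_learners`, `n_subspaces`, `max_dim`, `make_toy_data`) are hypothetical.

```python
import random


def centroid_fit(X, y, feats):
    """Fit a nearest-centroid base classifier using only the features in `feats`."""
    cents = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        cents[label] = [sum(r[j] for r in rows) / len(rows) for j in feats]
    return cents


def centroid_predict(cents, feats, x):
    """Predict the label whose centroid is closest within the subspace."""
    def dist(c):
        return sum((x[j] - cj) ** 2 for j, cj in zip(feats, c))
    return 0 if dist(cents[0]) <= dist(cents[1]) else 1


def error_rate(cents, feats, X, y):
    wrong = sum(centroid_predict(cents, feats, x) != t for x, t in zip(X, y))
    return wrong / len(y)


def rase_sketch(X, y, n_learners=20, n_subspaces=30, max_dim=3, seed=0):
    """For each weak learner, draw random subspaces and keep the best one.

    Assumption: training error replaces RIC as the selection criterion.
    """
    rng = random.Random(seed)
    p = len(X[0])
    learners = []
    for _ in range(n_learners):
        best = None
        for _ in range(n_subspaces):
            d = rng.randint(1, max_dim)               # random subspace size
            feats = sorted(rng.sample(range(p), d))   # random feature subset
            cents = centroid_fit(X, y, feats)
            score = error_rate(cents, feats, X, y)    # stand-in for RIC
            if best is None or score < best[0]:
                best = (score, feats, cents)
        learners.append((best[1], best[2]))
    return learners


def rase_predict(learners, x):
    """Majority vote over weak learners (fixed 0.5 threshold, an assumption)."""
    votes = sum(centroid_predict(c, f, x) for f, c in learners)
    return 1 if votes / len(learners) > 0.5 else 0


def feature_frequency(learners, p):
    """Fraction of selected subspaces that contain each feature."""
    counts = [0] * p
    for feats, _ in learners:
        for j in feats:
            counts[j] += 1
    return [c / len(learners) for c in counts]


def make_toy_data(n=200, p=10, shift=4.0, seed=1):
    """Toy sparse problem: only features 0 and 1 separate the two classes."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(p)]
        row[0] += shift * label
        row[1] += shift * label
        X.append(row)
        y.append(label)
    return X, y
```

Calling `rase_sketch` on the output of `make_toy_data` and then `feature_frequency` on the result gives a rough feature ranking; on this toy problem the signal features 0 and 1 would be expected to appear often among the selected subspaces. The iterative version mentioned in the abstract is not covered by this sketch.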