Paper title
Variable selection and basis learning for ordinal classification
Paper authors
Paper abstract
We propose a method for variable selection and basis learning for high-dimensional classification with ordinal responses. The proposed method extends sparse multiclass linear discriminant analysis, with the aim of identifying not only the variables relevant to discrimination but also the variables that are order-concordant with the responses. For this purpose, we compute an ordinal weight for each variable, giving larger weights to variables with ordered group means, and we penalize the variables with smaller weights more severely. A two-step construction for the ordinal weights is developed, and we show that the ordinal weights correctly separate ordinal variables from non-ordinal variables with high probability. The resulting sparse ordinal basis learning method is shown to consistently select either the discriminant variables or the ordinal and discriminant variables, depending on the choice of a tunable parameter. These asymptotic guarantees are given under a high-dimensional asymptotic regime in which the dimension grows much faster than the sample size. We also discuss a two-step procedure for post-screening ordinal variables among the selected discriminant variables. Simulated and real data analyses confirm that the proposed basis learning provides a sparse and interpretable basis, as it mostly consists of ordinal variables.
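The idea of weighting variables by how well their group means follow the class order can be illustrated with a small sketch. This is not the paper's two-step construction, which the abstract does not spell out. The score below (net change of the group means divided by their total variation) and the names `ordinal_weights`, `penalty_factors`, and `eps` are assumptions chosen purely for illustration.

```python
import numpy as np

def ordinal_weights(X, y):
    """Illustrative order-concordance score in [0, 1] for each column of X.

    Assumption: this is a stand-in score, not the weight defined in the paper.
    """
    classes = np.sort(np.unique(y))
    # Group means: one row per class (in ordinal order), one column per variable.
    means = np.vstack([X[y == k].mean(axis=0) for k in classes])
    diffs = np.diff(means, axis=0)      # differences between adjacent classes
    total = np.abs(diffs).sum(axis=0)   # total variation of the group means
    net = np.abs(diffs.sum(axis=0))     # |mean of last class - mean of first class|
    # net / total equals 1 when the group means are monotone in the class
    # order; variables with zero total variation get weight 0.
    return np.divide(net, total, out=np.zeros_like(total), where=total > 0)

def penalty_factors(weights, eps=1e-3):
    """Hypothetical per-variable penalty multipliers: smaller weight, larger penalty."""
    return 1.0 / (weights + eps)

# Toy data: column 0 has ordered group means (0, 1, 2); column 1 does not (0, 2, 0).
y = np.array([0, 0, 1, 1, 2, 2])
X = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 2.0], [1.0, 2.0], [2.0, 0.0], [2.0, 0.0]])
w = ordinal_weights(X, y)
print(w, penalty_factors(w))
```

In a sparse discriminant analysis fit, such multipliers would scale each variable's penalty term, so that variables whose group means do not follow the class order are more likely to be shrunk to zero.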