Paper Title

A Compressive Classification Framework for High-Dimensional Data

Authors

Muhammad Naveed Tabassum, Esa Ollila

Abstract

We propose a compressive classification framework for settings where the data dimensionality is significantly higher than the sample size. The proposed method, referred to as compressive regularized discriminant analysis (CRDA), is based on linear discriminant analysis and can select significant features by using joint-sparsity promoting hard thresholding in the discriminant rule. Since the number of features is larger than the sample size, the method also uses state-of-the-art regularized sample covariance matrix estimators. Several analysis examples on real data sets, including image, speech signal, and gene expression data, illustrate the promising improvements offered by the proposed CRDA classifier in practice. Overall, the proposed method gives fewer misclassification errors than its competitors while achieving accurate feature selection results. The open-source R package and MATLAB toolbox of the proposed method (named compressiveRDA) are freely available.
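The idea summarized in the abstract can be illustrated with a minimal Python sketch. This is not the compressiveRDA implementation: the function names `fit_sparse_lda` and `predict`, the fixed `shrink` parameter, the shrinkage target (a scaled identity), and the use of the l2 row norm for thresholding are all assumptions made for illustration. The sketch fits a linear discriminant rule with a regularized pooled covariance estimate, then keeps only the rows of the coefficient matrix with the largest norms, so the same features are selected jointly for all classes.

```python
import numpy as np


def fit_sparse_lda(X, y, n_keep, shrink=0.5):
    """Fit an LDA-style rule with row-wise hard thresholding (illustrative only)."""
    classes = np.unique(y)
    n, p = X.shape
    # Class means stored as columns of a p x G matrix.
    M = np.column_stack([X[y == c].mean(axis=0) for c in classes])
    # Pooled within-class sample covariance.
    Xc = X - M[:, np.searchsorted(classes, y)].T
    S = Xc.T @ Xc / (n - len(classes))
    # Assumed regularizer: shrink toward a scaled identity so that the
    # estimate stays invertible when p > n.
    Sigma = (1 - shrink) * S + shrink * (np.trace(S) / p) * np.eye(p)
    B = np.linalg.solve(Sigma, M)  # p x G coefficient matrix
    # Joint-sparsity hard thresholding: keep the n_keep rows of B with the
    # largest l2 norm and zero out every other row.
    row_norms = np.linalg.norm(B, axis=1)
    keep = np.sort(np.argsort(row_norms)[-n_keep:])
    B_sparse = np.zeros_like(B)
    B_sparse[keep] = B[keep]
    # Per-class offsets of the linear discriminant, using empirical priors.
    priors = np.array([(y == c).mean() for c in classes])
    offsets = -0.5 * np.sum(M * B_sparse, axis=0) + np.log(priors)
    return classes, B_sparse, offsets, keep


def predict(model, X):
    """Assign each row of X to the class with the largest discriminant score."""
    classes, B, offsets, _ = model
    return classes[np.argmax(X @ B + offsets, axis=1)]
```

Calling `fit_sparse_lda(X, y, n_keep=K)` on an `n x p` data matrix returns the class labels, the thresholded coefficients, the offsets, and the indices of the `K` retained features; `predict` then scores new samples using only those features. In practice both `n_keep` and `shrink` would need to be chosen from the data, for example by cross-validation.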
