Paper Title
Model Selection in High-Dimensional Block-Sparse Linear Regression
Paper Authors
Paper Abstract
Model selection is an indispensable part of data analysis, which is frequently concerned with fitting and prediction. In this paper, we tackle the problem of model selection in a general linear regression where the parameter matrix possesses a block-sparse structure, i.e., the non-zero entries occur in clusters or blocks, and the number of such non-zero blocks is very small compared to the parameter dimension. Furthermore, we consider a high-dimensional setting in which the parameter dimension is large compared to the number of available measurements. To perform model selection in this setting, we present an information criterion that generalizes the Extended Bayesian Information Criterion-Robust (EBIC-R) and accounts for both the block structure and the high-dimensional scenario. The analytical steps for deriving the EBIC-R for this setting are provided. Simulation results show that the proposed method performs considerably better than existing state-of-the-art methods and achieves empirical consistency at large sample sizes and/or at high SNR.
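To make the setting described in the abstract concrete, the following is a minimal sketch of a synthetic block-sparse, high-dimensional regression problem. It is not the authors' simulation setup: the function name `make_block_sparse_problem`, the Gaussian design matrix, the equal-length contiguous blocks, the SNR definition, and all default values are assumptions chosen for illustration. It does not implement EBIC-R, whose formula is not given in the abstract.

```python
import numpy as np


def make_block_sparse_problem(n=50, p=200, n_cols=2, block_len=4,
                              k_blocks=3, snr_db=20.0, seed=0):
    """Generate Y = A X + E with a block-sparse parameter matrix X.

    Hypothetical setup (all choices are assumptions, not from the paper):
    - A is an n x p Gaussian design matrix with p > n (high-dimensional).
    - The p rows of X are split into p // block_len contiguous blocks.
    - Only k_blocks of those blocks are non-zero.
    """
    if p % block_len != 0:
        raise ValueError("p must be divisible by block_len")
    rng = np.random.default_rng(seed)
    n_blocks = p // block_len

    A = rng.standard_normal((n, p))
    X = np.zeros((p, n_cols))

    # Choose which blocks are active (the true model to be selected).
    active = np.sort(rng.choice(n_blocks, size=k_blocks, replace=False))
    for b in active:
        rows = slice(b * block_len, (b + 1) * block_len)
        X[rows, :] = rng.standard_normal((block_len, n_cols))

    # Scale the noise so the average signal-to-noise ratio equals snr_db.
    signal = A @ X
    noise_var = np.mean(signal ** 2) / 10 ** (snr_db / 10)
    Y = signal + np.sqrt(noise_var) * rng.standard_normal((n, n_cols))
    return A, X, Y, active


if __name__ == "__main__":
    A, X, Y, active = make_block_sparse_problem()
    print("design matrix:", A.shape, "measurements:", Y.shape)
    print("active blocks:", active.tolist())
```

In this sketch, model selection amounts to recovering `active`, the index set of non-zero blocks, from `A` and `Y` alone. With the defaults, only 3 of 50 blocks are non-zero and the parameter dimension (200) exceeds the number of measurements (50), matching the regime the abstract describes.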