Paper Title
Suboptimality of Constrained Least Squares and Improvements via Non-Linear Predictors
Paper Authors
Paper Abstract
We study the problem of predicting as well as the best linear predictor in a bounded Euclidean ball with respect to the squared loss. When only boundedness of the data-generating distribution is assumed, we establish that the least squares estimator constrained to a bounded Euclidean ball does not attain the classical $O(d/n)$ excess risk rate, where $d$ is the dimension of the covariates and $n$ is the number of samples. In particular, we construct a bounded distribution such that the constrained least squares estimator incurs an excess risk of order $\Omega(d^{3/2}/n)$, hence refuting a recent conjecture of Ohad Shamir [JMLR 2015]. In contrast, we observe that non-linear predictors can achieve the optimal rate $O(d/n)$ with no assumptions on the distribution of the covariates. We discuss additional distributional assumptions sufficient to guarantee an $O(d/n)$ excess risk rate for the least squares estimator. Among them are certain moment equivalence assumptions often used in the robust statistics literature. While such assumptions are central in the analysis of unbounded and heavy-tailed settings, our work indicates that in some cases they also rule out unfavorable bounded distributions.
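To make the estimator in the abstract concrete, here is a minimal NumPy sketch of least squares constrained to a Euclidean ball. It is an illustration only, not the authors' construction or code. The function names `constrained_least_squares` and `empirical_risk`, the radius parameter `r`, and the choice of solver (bisection on a ridge penalty, which follows from the KKT conditions of the constrained problem) are all assumptions made for this example.

```python
import numpy as np


def empirical_risk(X, y, w):
    """Average squared loss of the linear predictor w on the sample (X, y)."""
    return float(np.mean((X @ w - y) ** 2))


def constrained_least_squares(X, y, r, n_iter=100):
    """Minimize the empirical squared loss over the ball {w : ||w||_2 <= r}.

    Hypothetical helper for illustration; the solver choice is an assumption.
    """
    if r <= 0:
        raise ValueError("r must be positive")
    n = X.shape[0]

    # Minimum-norm unconstrained solution. If it lies in the ball,
    # it is also a minimizer of the constrained problem.
    w_ols = np.linalg.lstsq(X, y, rcond=None)[0]
    if np.linalg.norm(w_ols) <= r:
        return w_ols

    # Otherwise the constraint is active and the KKT conditions give
    # w(lam) = (G + lam * I)^{-1} b for some lam > 0 with ||w(lam)|| = r.
    G = X.T @ X / n
    b = X.T @ y / n
    evals, V = np.linalg.eigh(G)
    evals = np.maximum(evals, 0.0)  # guard against tiny negative eigenvalues
    c = V.T @ b

    def ridge(lam):
        return V @ (c / (evals + lam))

    # ||w(lam)|| is decreasing in lam and ||w(lam)|| <= ||b|| / lam,
    # so lam = ||b|| / r is a feasible upper end for the bisection.
    lo, hi = 0.0, np.linalg.norm(b) / r
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(ridge(mid)) > r:
            lo = mid
        else:
            hi = mid
    return ridge(hi)  # hi always satisfies the norm constraint


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(-1.0, 1.0, size=(200, 5))  # bounded covariates
    y = X @ np.ones(5) + 0.1 * rng.standard_normal(200)
    w_hat = constrained_least_squares(X, y, r=1.0)
    print(np.linalg.norm(w_hat), empirical_risk(X, y, w_hat))
```

The sketch only computes the estimator on a sample. It does not reproduce the lower-bound distribution or the non-linear predictors mentioned in the abstract.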