Paper title
On lower bounds for the bias-variance trade-off
Paper authors
Paper abstract
It is a common phenomenon that for high-dimensional and nonparametric statistical models, rate-optimal estimators balance squared bias and variance. Although this balancing is widely observed, little is known about whether methods exist that could avoid the trade-off between bias and variance. We propose a general strategy to obtain lower bounds on the variance of any estimator with bias smaller than a prespecified bound. This shows to what extent the bias-variance trade-off is unavoidable and allows us to quantify the loss of performance for methods that do not obey it. The approach is based on a number of abstract lower bounds for the variance involving the change of expectation with respect to different probability measures as well as information measures such as the Kullback-Leibler or $χ^2$-divergence. In the second part of the article, the abstract lower bounds are applied to several statistical models, including the Gaussian white noise model, a boundary estimation problem, the Gaussian sequence model and the high-dimensional linear regression model. For these specific statistical applications, different types of bias-variance trade-offs occur that vary considerably in their strength. For the trade-off between integrated squared bias and integrated variance in the Gaussian white noise model, we propose to combine the general strategy for lower bounds with a reduction technique. This allows us to reduce the original problem to a lower bound on the bias-variance trade-off for estimators with additional symmetry properties in a simpler statistical model. In the Gaussian sequence model, different phase transitions of the bias-variance trade-off occur. Although there is a non-trivial interplay between bias and variance, the rates of the squared bias and the variance do not have to be balanced in order to achieve the minimax estimation rate.
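As an illustration of what a variance lower bound based on a change of expectation can look like, the classical Hammersley-Chapman-Robbins-type inequality is sketched below. It is offered only as a plausible example of this kind of bound and is not necessarily the form derived in the article. The symbols $P$, $Q$, $X$, $\theta$, $\theta'$, $\hat\theta$ and $B$ are notation assumed here, not taken from the abstract. For probability measures $Q \ll P$ and a random variable $X$ with finite variance under $P$, the Cauchy-Schwarz inequality gives

$$
\bigl(E_Q[X] - E_P[X]\bigr)^2
= \Bigl(E_P\Bigl[\Bigl(\tfrac{dQ}{dP} - 1\Bigr)\bigl(X - E_P[X]\bigr)\Bigr]\Bigr)^2
\le \chi^2(Q, P)\,\operatorname{Var}_P(X),
\qquad
\chi^2(Q, P) = E_P\Bigl[\Bigl(\tfrac{dQ}{dP} - 1\Bigr)^2\Bigr].
$$

If an estimator $\hat\theta$ of a real parameter has absolute bias at most $B$ under both $P_\theta$ and $P_{\theta'}$, then $|E_{\theta'}[\hat\theta] - E_\theta[\hat\theta]| \ge |\theta' - \theta| - 2B$. Whenever $P_{\theta'} \ll P_\theta$ and $0 < \chi^2(P_{\theta'}, P_\theta) < \infty$, this yields

$$
\operatorname{Var}_\theta(\hat\theta) \ge \frac{\bigl(|\theta' - \theta| - 2B\bigr)_+^2}{\chi^2(P_{\theta'}, P_\theta)}.
$$

In this sketch, a smaller bias bound $B$ makes the lower bound on the variance larger, which is the kind of trade-off the abstract describes.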