Paper Title
Fast Rates for Noisy Interpolation Require Rethinking the Effects of Inductive Bias
Authors
Abstract
Good generalization performance on high-dimensional data crucially hinges on a simple structure of the ground truth and a corresponding strong inductive bias of the estimator. Even though this intuition is valid for regularized models, in this paper we caution against a strong inductive bias for interpolation in the presence of noise: While a stronger inductive bias encourages a simpler structure that is more aligned with the ground truth, it also increases the detrimental effect of noise. Specifically, for both linear regression and classification with a sparse ground truth, we prove that minimum $\ell_p$-norm and maximum $\ell_p$-margin interpolators achieve fast polynomial rates close to order $1/n$ for $p > 1$ compared to a logarithmic rate for $p = 1$. Finally, we provide preliminary experimental evidence that this trade-off may also play a crucial role in understanding non-linear interpolating models used in practice.
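For reference, the two estimators named in the abstract are usually defined as below. This is a sketch in standard notation, not necessarily the paper's own: the data matrix $X \in \mathbb{R}^{n \times d}$ with rows $x_i$, the label vector $y$, and the estimate $\hat{w}$ are assumed symbols that do not appear in the abstract.

```latex
% Minimum \ell_p-norm interpolator (regression):
% the smallest-norm weight vector that fits all n samples exactly.
\hat{w} \in \arg\min_{w \in \mathbb{R}^d} \|w\|_p
\quad \text{subject to} \quad Xw = y

% Maximum \ell_p-margin interpolator (classification, y_i \in \{-1, +1\}):
% the unit-norm direction that maximizes the worst-case margin.
\hat{w} \in \arg\max_{\|w\|_p \le 1} \; \min_{1 \le i \le n} \; y_i \langle x_i, w \rangle
```

In both formulations, choosing $p$ closer to 1 gives a stronger sparsity-inducing bias, which is the regime the abstract cautions against when the labels are noisy.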