Paper title
Out-of-sample error estimate for robust M-estimators with convex penalty
Paper authors
Paper abstract
A generic out-of-sample error estimate is proposed for robust $M$-estimators regularized with a convex penalty in high-dimensional linear regression where $(X,y)$ is observed and $p,n$ are of the same order. If $ψ$ is the derivative of the robust data-fitting loss $ρ$, the estimate depends on the observed data only through the quantities $\hatψ = ψ(y-X\hatβ)$, $X^\top \hatψ$ and the derivatives $(\partial/\partial y) \hatψ$ and $(\partial/\partial y) X\hatβ$ for fixed $X$. The out-of-sample error estimate enjoys a relative error of order $n^{-1/2}$ in a linear model with Gaussian covariates and independent noise, either non-asymptotically when $p/n\le γ$ or asymptotically in the high-dimensional asymptotic regime $p/n\toγ'\in(0,\infty)$. General differentiable loss functions $ρ$ are allowed provided that $ψ=ρ'$ is 1-Lipschitz. The validity of the out-of-sample error estimate holds either under a strong convexity assumption, or for the $\ell_1$-penalized Huber $M$-estimator if the number of corrupted observations and the sparsity of the true $β$ are bounded from above by $s_*n$ for some small enough constant $s_*\in(0,1)$ independent of $n,p$. For the square loss and in the absence of corruption in the response, the results additionally yield $n^{-1/2}$-consistent estimates of the noise variance and of the generalization error. This generalizes, to arbitrary convex penalty, estimates that were previously known for the Lasso.
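The abstract states that the proposed estimate depends on the data only through $\hatψ = ψ(y-X\hatβ)$, $X^\top\hatψ$ and certain derivatives. As an illustration only (not the paper's estimator), a minimal numpy sketch computing the first two of these quantities for the Huber loss, whose derivative $ψ$ is 1-Lipschitz as the abstract requires; an ordinary least-squares fit stands in here for the convex-penalized $M$-estimate $\hatβ$:

```python
import numpy as np

def huber_psi(r, delta=1.0):
    # psi = rho' for the Huber loss: identity on [-delta, delta],
    # saturated at +/- delta outside. This psi is 1-Lipschitz.
    return np.clip(r, -delta, delta)

rng = np.random.default_rng(0)
n, p = 200, 100                     # p and n of the same order
X = rng.standard_normal((n, p))     # Gaussian covariates
beta = np.zeros(p)
beta[:5] = 1.0                      # sparse true coefficient vector
y = X @ beta + rng.standard_normal(n)

# Placeholder fit: any convex-penalized M-estimate would be used here.
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]

psi_hat = huber_psi(y - X @ beta_hat)   # \hat\psi = psi(y - X \hat\beta)
score = X.T @ psi_hat                   # X^T \hat\psi
```

The derivative quantities $(\partial/\partial y)\hatψ$ and $(\partial/\partial y)X\hatβ$ would additionally require differentiating the fitted values with respect to $y$ at fixed $X$, which depends on the chosen penalty and solver.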