Paper Title

Robust Sparse Mean Estimation via Sum of Squares

Paper Authors

Ilias Diakonikolas, Daniel M. Kane, Sushrut Karmalkar, Ankit Pensia, Thanasis Pittas

Paper Abstract

We study the problem of high-dimensional sparse mean estimation in the presence of an $ε$-fraction of adversarial outliers. Prior work obtained sample and computationally efficient algorithms for this task for identity-covariance subgaussian distributions. In this work, we develop the first efficient algorithms for robust sparse mean estimation without a priori knowledge of the covariance. For distributions on $\mathbb R^d$ with "certifiably bounded" $t$-th moments and sufficiently light tails, our algorithm achieves error of $O(ε^{1-1/t})$ with sample complexity $m = (k\log(d))^{O(t)}/ε^{2-2/t}$. For the special case of the Gaussian distribution, our algorithm achieves near-optimal error of $\tilde O(ε)$ with sample complexity $m = O(k^4 \mathrm{polylog}(d))/ε^2$. Our algorithms follow the Sum-of-Squares-based proofs-to-algorithms approach. We complement our upper bounds with Statistical Query and low-degree polynomial testing lower bounds, providing evidence that the sample-time-error tradeoffs achieved by our algorithms are qualitatively the best possible.
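
As a concrete illustration of the contamination model described in the abstract (not the paper's Sum-of-Squares algorithm), the minimal Python sketch below draws identity-covariance Gaussian samples with a $k$-sparse mean, replaces an $ε$-fraction of them with outliers, and compares the plain sample mean against a naive coordinate-wise-median-plus-top-$k$ baseline. All numeric parameters are arbitrary placeholders chosen for illustration.

```python
# Illustration of epsilon-contaminated sparse mean estimation.
# NOT the paper's Sum-of-Squares algorithm; only the problem setup
# and a naive baseline, with arbitrary placeholder parameters.
import numpy as np

rng = np.random.default_rng(0)
d, k, m, eps = 1000, 10, 5000, 0.05  # dimension, sparsity, samples, outlier fraction

# k-sparse true mean: k nonzero coordinates of magnitude 1
mu = np.zeros(d)
mu[:k] = 1.0

# inliers: identity-covariance Gaussian samples centered at mu
X = rng.standard_normal((m, d)) + mu

# adversarial corruption (a crude fixed shift here; a real adversary
# may place the eps-fraction of outliers anywhere)
n_bad = int(eps * m)
X[:n_bad] = 10.0

# naive baseline: coordinate-wise median, then keep the k largest entries
med = np.median(X, axis=0)
support = np.argsort(np.abs(med))[-k:]
mu_hat = np.zeros(d)
mu_hat[support] = med[support]

print("error of plain sample mean:", np.linalg.norm(X.mean(axis=0) - mu))
print("error of median + top-k   :", np.linalg.norm(mu_hat - mu))
```

This naive baseline happens to resist the simple corruption above, but unlike the paper's algorithm it comes with no guarantee against a worst-case adversary and makes no use of certifiably bounded moments.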
