Paper Title
Hardness of Agnostically Learning Halfspaces from Worst-Case Lattice Problems
Paper Authors
Paper Abstract
We show hardness of improperly learning halfspaces in the agnostic model, both in the distribution-independent and in the distribution-specific setting, based on the assumption that worst-case lattice problems, such as GapSVP or SIVP, are hard. In particular, we show that under this assumption there is no efficient algorithm that outputs any binary hypothesis, not necessarily a halfspace, achieving misclassification error better than $\frac{1}{2} - \gamma$ even if the optimal misclassification error is as small as $\delta$. Here, $\gamma$ can be smaller than the inverse of any polynomial in the dimension and $\delta$ can be as small as $\exp(-\Omega(\log^{1-c}(d)))$, where $0 < c < 1$ is an arbitrary constant and $d$ is the dimension. For the distribution-specific setting, we show that if the marginal distribution is standard Gaussian, then for any $\beta > 0$, learning halfspaces up to error $OPT_{LTF} + \epsilon$ takes time at least $d^{\tilde{\Omega}(1/\epsilon^{2-\beta})}$ under the same hardness assumption. Similarly, we show that learning degree-$\ell$ polynomial threshold functions up to error $OPT_{{PTF}_\ell} + \epsilon$ takes time at least $d^{\tilde{\Omega}(\ell^{2-\beta}/\epsilon^{2-\beta})}$. Here, $OPT_{LTF}$ and $OPT_{{PTF}_\ell}$ denote the best error achievable by any halfspace or degree-$\ell$ polynomial threshold function, respectively.

Our lower bounds qualitatively match algorithmic guarantees and (nearly) recover known lower bounds based on non-worst-case assumptions. Previously, such hardness results [Daniely16, DKPZ21] were based on average-case complexity assumptions or were restricted to the statistical query model. Our work gives the first hardness results for these fundamental learning problems based on worst-case complexity assumptions. It is inspired by a sequence of recent works showing hardness of learning well-separated Gaussian mixtures based on worst-case lattice problems.
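For reference, here is a minimal sketch of the standard definitions behind the quantities in the abstract; the hypothesis notation $h_{w,\theta}$, the polynomial $p$, and the joint distribution $\mathcal{D}$ over $\mathbb{R}^d \times \{\pm 1\}$ are illustrative notation introduced here, not taken from the paper:

% Illustrative notation only: h_{w,\theta}, p, and \mathcal{D} are not from the paper.
\[
h_{w,\theta}(x) = \mathrm{sign}(\langle w, x \rangle - \theta), \qquad
\mathrm{err}_{\mathcal{D}}(h) = \Pr_{(x,y) \sim \mathcal{D}}\left[h(x) \neq y\right],
\]
\[
OPT_{LTF} = \inf_{w \in \mathbb{R}^d,\ \theta \in \mathbb{R}} \mathrm{err}_{\mathcal{D}}(h_{w,\theta}), \qquad
OPT_{{PTF}_\ell} = \inf_{\deg(p) \leq \ell} \mathrm{err}_{\mathcal{D}}(\mathrm{sign} \circ p).
\]

In this terminology, an agnostic learner must output a hypothesis $h$ with $\mathrm{err}_{\mathcal{D}}(h) \leq OPT + \epsilon$, and "improper" means that $h$ need not itself belong to the class being learned.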