Paper Title
On the Power of Localized Perceptron for Label-Optimal Learning of Halfspaces with Adversarial Noise
Paper Authors
Paper Abstract
We study {\em online} active learning of homogeneous halfspaces in $\mathbb{R}^d$ with adversarial noise, where the overall probability of a noisy label is constrained to be at most $\nu$. Our main contribution is a Perceptron-like online active learning algorithm that runs in polynomial time, and under the conditions that the marginal distribution is isotropic log-concave and $\nu = \Omega(\epsilon)$, where $\epsilon \in (0, 1)$ is the target error rate, our algorithm PAC learns the underlying halfspace with near-optimal label complexity of $\tilde{O}\big(d \cdot \mathrm{polylog}(\frac{1}{\epsilon})\big)$ and sample complexity of $\tilde{O}\big(\frac{d}{\epsilon}\big)$. Prior to this work, existing online algorithms designed to tolerate adversarial noise were subject to either label complexity polynomial in $\frac{1}{\epsilon}$, suboptimal noise tolerance, or restrictive marginal distributions. With the additional prior knowledge that the underlying halfspace is $s$-sparse, we obtain an attribute-efficient label complexity of $\tilde{O}\big(s \cdot \mathrm{polylog}(d, \frac{1}{\epsilon})\big)$ and sample complexity of $\tilde{O}\big(\frac{s}{\epsilon} \cdot \mathrm{polylog}(d)\big)$. As an immediate corollary, we show that under the agnostic model, where no assumption is made on the noise rate $\nu$, our active learner achieves an error rate of $O(\mathrm{OPT}) + \epsilon$ with the same running time, label complexity, and sample complexity, where $\mathrm{OPT}$ is the best error rate achievable by any homogeneous halfspace.
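For intuition about the "localized Perceptron" idea, below is a minimal sketch of the generic margin-based active Perceptron template: labels are queried only for points falling inside a shrinking band around the current decision boundary, and queried mistakes trigger Perceptron updates. This is an illustrative assumption-laden sketch, not the paper's actual algorithm; the function name, the band-halving schedule, the per-epoch label budget, and the random-flip noise in the toy oracle are all hypothetical choices made here for concreteness.

```python
import numpy as np

def margin_based_active_perceptron(X, oracle, epochs=8, b0=0.5,
                                   labels_per_epoch=100, seed=0):
    """Sketch of a margin-based (localized) active Perceptron.

    X: (n, d) array of unlabeled instances, assumed roughly isotropic.
    oracle(i): returns the (possibly noisy) +/-1 label of instance i.
    Returns a unit-norm weight vector estimating the target halfspace.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = rng.standard_normal(d)
    w /= np.linalg.norm(w)
    b = b0                                # initial half-width of the query band
    for _ in range(epochs):
        used = 0
        for i in rng.permutation(n):
            x = X[i]
            if abs(w @ x) > b * np.linalg.norm(x):
                continue                  # outside the band: spend no label here
            y = oracle(i)                 # query only near the current boundary
            used += 1
            if y * (w @ x) <= 0:          # mistake-driven Perceptron update
                w += y * x
                w /= np.linalg.norm(w)
            if used >= labels_per_epoch:
                break
        b /= 2                            # shrink the band as the estimate localizes
    return w

# Toy usage: labels from a hidden halfspace under a Gaussian marginal,
# with a 5% random flip standing in (crudely) for bounded label noise.
rng = np.random.default_rng(1)
d, n = 20, 5000
w_star = rng.standard_normal(d)
w_star /= np.linalg.norm(w_star)
X = rng.standard_normal((n, d))

def oracle(i):
    y = np.sign(X[i] @ w_star)
    return -y if rng.random() < 0.05 else y

w_hat = margin_based_active_perceptron(X, oracle)
print("angle error:", np.arccos(np.clip(w_hat @ w_star, -1.0, 1.0)))
```

The band test is what keeps label usage low: far-from-boundary points are overwhelmingly classified correctly by the current hypothesis under a log-concave marginal, so querying them wastes budget, while the shrinking band concentrates labels where disagreement with the target halfspace is likeliest.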