Paper Title

Robustly-reliable learners under poisoning attacks

Paper Authors

Maria-Florina Balcan, Avrim Blum, Steve Hanneke, Dravyansh Sharma

Paper Abstract

Data poisoning attacks, in which an adversary corrupts a training set with the goal of inducing specific desired mistakes, have raised substantial concern: even just the possibility of such an attack can make a user no longer trust the results of a learning system. In this work, we show how to achieve strong robustness guarantees in the face of such attacks across multiple axes. We provide robustly-reliable predictions, in which the predicted label is guaranteed to be correct so long as the adversary has not exceeded a given corruption budget, even in the presence of instance targeted attacks, where the adversary knows the test example in advance and aims to cause a specific failure on that example. Our guarantees are substantially stronger than those in prior approaches, which were only able to provide certificates that the prediction of the learning algorithm does not change, as opposed to certifying that the prediction is correct, as we are able to achieve in our work. Remarkably, we provide a complete characterization of learnability in this setting, in particular, nearly-tight matching upper and lower bounds on the region that can be certified, as well as efficient algorithms for computing this region given an ERM oracle. Moreover, for the case of linear separators over logconcave distributions, we provide efficient truly polynomial time algorithms (i.e., non-oracle algorithms) for such robustly-reliable predictions. We also extend these results to the active setting where the algorithm adaptively asks for labels of specific informative examples, and the difficulty is that the adversary might even be adaptive to this interaction, as well as to the agnostic learning setting where there is no perfect classifier even over the uncorrupted data.
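To make the notion of a robustly-reliable prediction concrete, here is a minimal Python sketch over a finite hypothesis class (1-D thresholds). The rule it implements, predicting the label agreed upon by all low-error hypotheses and reporting the largest corruption budget under which that agreement holds, mirrors the idea described in the abstract; however, the names `threshold_class` and `robustly_reliable_predict` and the toy data are illustrative assumptions, not the paper's actual algorithms or notation.

```python
# Minimal sketch of a robustly-reliable prediction rule over a finite
# hypothesis class. Assumption: the uncorrupted data is realizable by some
# hypothesis in the class, and the adversary may flip training labels.

def threshold_class(grid):
    """Finite class of 1-D threshold classifiers: h_t(x) = 1 iff x >= t."""
    return [(t, lambda x, t=t: 1 if x >= t else 0) for t in grid]


def robustly_reliable_predict(hypotheses, sample, x):
    """Return (label, eta): `label` is guaranteed correct whenever the target
    lies in `hypotheses` and the adversary flipped at most `eta` labels in
    `sample`; eta = -1 means no nontrivial certificate for this test point.
    """
    # Empirical error of each hypothesis on the (possibly corrupted) sample,
    # together with its prediction on the test point x.
    errors = [
        (sum(1 for (xi, yi) in sample if h(xi) != yi), h(x))
        for (_, h) in hypotheses
    ]
    best = min(err for err, _ in errors)
    labels_at_best = {pred for err, pred in errors if err == best}
    if len(labels_at_best) > 1:
        return None, -1  # even the lowest-error hypotheses disagree on x

    label = labels_at_best.pop()
    # Largest budget b such that every hypothesis with error <= b predicts
    # `label`: if at most b labels were corrupted, the true target has error
    # <= b on the corrupted sample, hence lies in that set and agrees on x.
    disagree_errors = [err for err, pred in errors if pred != label]
    eta = (min(disagree_errors) - 1) if disagree_errors else len(sample)
    return label, eta


if __name__ == "__main__":
    H = threshold_class([i / 10 for i in range(11)])
    # Clean target: threshold at 0.5; one adversarially flipped label (0.8 -> 0).
    S = [(0.1, 0), (0.2, 0), (0.4, 0), (0.6, 1), (0.8, 0), (0.9, 1)]
    for x in (0.05, 0.55, 0.95):
        print(x, robustly_reliable_predict(H, S, x))
```

On the toy data, each query returns a label together with a budget η: the label is certified correct as long as the adversary flipped at most η training labels, and η = -1 marks test points where no nontrivial certificate exists. This brute-force version enumerates the whole class; the paper's contribution includes characterizing the certifiable region and computing it efficiently, e.g., via an ERM oracle or in truly polynomial time for linear separators over logconcave distributions.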
