Paper Title


Trade-offs between membership privacy & adversarially robust learning

Paper Authors

Hayes, Jamie

Paper Abstract


Historically, machine learning methods have not been designed with security in mind. In turn, this has given rise to adversarial examples, carefully perturbed input samples aimed to mislead detection at test time, which have been applied to attack spam and malware classification, and more recently to attack image classification. Consequently, an abundance of research has been devoted to designing machine learning methods that are robust to adversarial examples. Unfortunately, there are desiderata besides robustness that a secure and safe machine learning model must satisfy, such as fairness and privacy. Recent work by Song et al. (2019) has shown, empirically, that there exists a trade-off between robust and private machine learning models. Models designed to be robust to adversarial examples often overfit on training data to a larger extent than standard (non-robust) models. If a dataset contains private information, then any statistical test that separates training and test data by observing a model's outputs can represent a privacy breach, and if a model overfits on training data, these statistical tests become easier. In this work, we identify settings where standard models will overfit to a larger extent in comparison to robust models, and as empirically observed in previous works, settings where the opposite behavior occurs. Thus, it is not necessarily the case that privacy must be sacrificed to achieve robustness. The degree of overfitting naturally depends on the amount of data available for training. We go on to characterize how the training set size factors into the privacy risks exposed by training a robust model on a simple Gaussian data task, and show empirically that our findings hold on image classification benchmark datasets, such as CIFAR-10 and CIFAR-100.
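The "statistical test that separates training and test data by observing a model's outputs" described above is a membership inference attack. As a rough illustration only, and not the paper's construction, the sketch below runs a loss-threshold membership test on a toy two-class Gaussian task; the data model, the logistic-regression learner, and the threshold rule are all assumptions chosen for the example.

```python
# Minimal sketch (illustrative assumptions, not the paper's setup):
# a loss-threshold membership inference test on a two-class Gaussian task.
import numpy as np

rng = np.random.default_rng(0)

def sample_task(n, d=20, mu=0.5):
    """Two Gaussian classes with means +/- mu (illustrative data model)."""
    y = rng.integers(0, 2, size=n)                       # labels in {0, 1}
    x = rng.normal(size=(n, d)) + (2 * y[:, None] - 1) * mu
    return x, y

def train_logreg(x, y, steps=500, lr=0.1):
    """Plain logistic regression by gradient descent (stand-in model)."""
    w = np.zeros(x.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-x @ w))
        w -= lr * x.T @ (p - y) / len(y)
    return w

def per_example_loss(w, x, y):
    """Cross-entropy loss of each example under the trained model."""
    p = 1.0 / (1.0 + np.exp(-x @ w))
    return -(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))

# A small training set tends to overfit more, which makes the test easier.
x_tr, y_tr = sample_task(50)
x_te, y_te = sample_task(1000)
w = train_logreg(x_tr, y_tr)

# The "statistical test": guess 'member' when the loss falls below a threshold.
tau = np.median(per_example_loss(w, x_te, y_te))
tpr = np.mean(per_example_loss(w, x_tr, y_tr) < tau)    # members flagged
fpr = np.mean(per_example_loss(w, x_te, y_te) < tau)    # non-members flagged
print(f"membership advantage (TPR - FPR): {tpr - fpr:.3f}")
```

The gap between the flag rates on training and test data is one simple measure of the privacy leakage the abstract refers to; repeating the experiment with larger training sets shrinks this gap, which is the training-set-size effect the paper characterizes.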
