Paper Title

Defending against Reconstruction Attacks with Rényi Differential Privacy

Paper Authors

Pierre Stock, Igor Shilov, Ilya Mironov, Alexandre Sablayrolles

Abstract

Reconstruction attacks allow an adversary to regenerate data samples of the training set using access to only a trained model. It has recently been shown that simple heuristics can reconstruct data samples from language models, making this threat scenario an important aspect of model release. Differential privacy is a known solution to such attacks, but is often used with a relatively large privacy budget (epsilon > 8), which does not translate to meaningful guarantees. In this paper we show that, for the same mechanism, we can derive privacy guarantees for reconstruction attacks that are better than the traditional ones from the literature. In particular, we show that larger privacy budgets do not protect against membership inference, but can still protect against the extraction of rare secrets. We show experimentally that our guarantees hold against various language models, including GPT-2 finetuned on Wikitext-103.
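For context, the Rényi differential privacy (RDP) framework referenced in the title (Mironov, 2017) measures privacy loss through the Rényi divergence of order α. The following is the standard definition, reproduced here for the reader rather than taken from this paper: a randomized mechanism M satisfies (α, ε)-RDP if, for every pair of adjacent datasets D and D',

$$ D_{\alpha}\big(M(D)\,\|\,M(D')\big) \;=\; \frac{1}{\alpha-1}\,\log\,\mathbb{E}_{x\sim M(D')}\!\left[\left(\frac{\Pr[M(D)=x]}{\Pr[M(D')=x]}\right)^{\!\alpha}\right] \;\le\; \varepsilon. $$

An (α, ε)-RDP guarantee converts to standard (ε + log(1/δ)/(α−1), δ)-differential privacy for any δ ∈ (0, 1); for example, the Gaussian mechanism with sensitivity 1 and noise scale σ satisfies (α, α/(2σ²))-RDP for all α > 1.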
