Paper title
Improving Deep Learning with Differential Privacy using Gradient Encoding and Denoising
Paper authors
Paper abstract
Deep learning models leak significant amounts of information about their training datasets. Previous work has investigated training models with differential privacy (DP) guarantees by adding DP noise to the gradients. However, such solutions (specifically, DPSGD) result in large degradations in the accuracy of the trained models. In this paper, we aim to train deep learning models with DP guarantees while preserving model accuracy much better than previous work. Our key technique is to encode gradients so that they are mapped to a smaller vector space, which enables us to obtain DP guarantees for different noise distributions. This allows us to investigate and choose the noise distributions that best preserve model accuracy for a target privacy budget. We also take advantage of the post-processing property of differential privacy by introducing the idea of denoising, which further improves the utility of the trained models without degrading their DP guarantees. We show that our mechanism outperforms the state-of-the-art DPSGD; for instance, for the same model accuracy of $96.1\%$ on MNIST, our technique results in a privacy bound of $\epsilon = 3.2$ compared to $\epsilon = 6$ for DPSGD, which is a significant improvement.
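For orientation, the DPSGD baseline mentioned in the abstract is commonly described as clipping each per-example gradient and adding Gaussian noise to the sum. The sketch below illustrates only that general baseline idea. It is not the gradient encoding or denoising mechanism of this paper, which the abstract does not specify. The function names and the parameters `clip_norm` and `noise_multiplier` are assumptions chosen for illustration.

```python
import numpy as np


def clip_per_example(grads, clip_norm):
    """Scale each row (one example's gradient) so its L2 norm is at most clip_norm."""
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    # Rows already inside the ball keep scale 1; the floor avoids division by zero.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    return grads * scale


def noisy_mean_gradient(grads, clip_norm, noise_multiplier, rng):
    """Clip per-example gradients, sum them, add Gaussian noise, then average.

    The noise standard deviation is noise_multiplier * clip_norm (an assumed,
    commonly used parameterization, not taken from the abstract).
    """
    clipped = clip_per_example(grads, clip_norm)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=grads.shape[1])
    return (clipped.sum(axis=0) + noise) / grads.shape[0]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two hypothetical per-example gradients in a 2-dimensional parameter space.
    grads = np.array([[3.0, 4.0], [0.3, 0.4]])
    print(noisy_mean_gradient(grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng))
```

Any deterministic or independently randomized transformation applied to the noisy output afterwards (for example, a denoising step) does not weaken the DP guarantee, which is the post-processing property the abstract relies on.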