Paper Title

Differentially Private Diffusion Models

Paper Authors

Tim Dockhorn, Tianshi Cao, Arash Vahdat, Karsten Kreis

Paper Abstract

While modern machine learning models rely on increasingly large training datasets, data is often limited in privacy-sensitive domains. Generative models trained with differential privacy (DP) on sensitive data can sidestep this challenge, providing access to synthetic data instead. We build on the recent success of diffusion models (DMs) and introduce Differentially Private Diffusion Models (DPDMs), which enforce privacy using differentially private stochastic gradient descent (DP-SGD). We investigate the DM parameterization and the sampling algorithm, which turn out to be crucial ingredients in DPDMs, and propose noise multiplicity, a powerful modification of DP-SGD tailored to the training of DMs. We validate our novel DPDMs on image generation benchmarks and achieve state-of-the-art performance in all experiments. Moreover, on standard benchmarks, classifiers trained on DPDM-generated synthetic data perform on par with task-specific DP-SGD-trained classifiers, which has not been demonstrated before for DP generative models. Project page and code: https://nv-tlabs.github.io/DPDM.
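The abstract's key idea, noise multiplicity, can be illustrated with a minimal sketch: in DP-SGD, each example's gradient is clipped and noised, so averaging the diffusion loss over several noise draws per example reduces gradient variance at no extra privacy cost. The toy linear "denoiser", the loss form, and all hyperparameters below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def per_example_grad(theta, x, noise_multiplicity=4):
    """Gradient of a toy denoising loss for one example, averaged over
    several diffusion-noise draws (the "noise multiplicity" idea,
    sketched here on a linear model rather than a real diffusion net)."""
    grads = []
    for _ in range(noise_multiplicity):
        eps = rng.normal(size=x.shape)      # diffusion noise sample
        sigma_t = rng.uniform(0.1, 1.0)     # random noise level (toy schedule)
        x_noisy = x + sigma_t * eps
        pred = theta * x_noisy              # toy linear "denoiser"
        # d/dtheta of 0.5 * ||pred - eps||^2
        grads.append((pred - eps) * x_noisy)
    # Average over noise draws BEFORE clipping: each example still
    # contributes only one clipped gradient, so the privacy accounting
    # is unchanged while the gradient estimate gets less noisy.
    return np.mean(grads, axis=0)

def dp_sgd_step(theta, batch, clip_norm=1.0, noise_sigma=1.0, lr=0.1, K=4):
    """One DP-SGD step: clip per-example gradients, sum, add Gaussian noise."""
    clipped = []
    for x in batch:
        g = per_example_grad(theta, x, noise_multiplicity=K)
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    total = np.sum(clipped, axis=0)
    total += rng.normal(scale=noise_sigma * clip_norm, size=total.shape)
    return theta - lr * total / len(batch)

batch = [rng.normal(size=3) for _ in range(8)]
theta = rng.normal(size=3)
theta_new = dp_sgd_step(theta, batch)
```

With `K=1` this reduces to standard DP-SGD; larger `K` trades extra compute per example for a lower-variance gradient estimate, which the paper identifies as crucial for training diffusion models under DP.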
