Paper Title
Distributed DP-Helmet: Scalable Differentially Private Non-interactive Averaging of Single Layers
Paper Authors
Paper Abstract
In this work, we propose two differentially private, non-interactive, distributed learning algorithms in a framework called Distributed DP-Helmet. Our framework is based on what we coin blind averaging: each user locally learns and noises a model, and all users then jointly compute the mean of their models via a secure summation protocol. We provide experimental evidence that blind averaging for SVMs and for a single softmax layer (Softmax-SLP) can have a strong utility-privacy tradeoff: we reach an accuracy of 86% on CIFAR-10 for $\varepsilon$ = 0.4 and 1,000 users, of 44% on CIFAR-100 for $\varepsilon$ = 1.2 and 100 users, and of 39% on federated EMNIST for $\varepsilon$ = 0.4 and 3,400 users, all after a SimCLR-based pretraining. As an ablation, we study the resilience of our approach to a strongly non-IID setting. On the theoretical side, we show that blind averaging preserves differential privacy if the objective function is smooth, Lipschitz, and strongly convex, as is the case for SVMs. We show that these properties also hold for Softmax-SLP, which is often used for last-layer fine-tuning, such that, for a fixed model size, the privacy bound $\varepsilon$ of Softmax-SLP no longer depends on the number of classes. This marks a significant utility and privacy advantage of Softmax-SLP over SVMs. Furthermore, in the limit, blind averaging of hinge-loss SVMs converges to a centrally learned SVM. The latter result is based on the representer theorem and can be seen as a blueprint for finding convergence for other empirical risk minimizers (ERMs) such as Softmax-SLP.
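Below is a minimal, hypothetical sketch of the blind-averaging idea described in the abstract: each user trains a local linear model, perturbs it with Gaussian noise for differential privacy, and the noisy models are then averaged. All function names, hyperparameters, and the noise calibration are illustrative assumptions, not the paper's exact algorithm, and a real deployment would replace the plain averaging step with a secure summation protocol.

```python
import numpy as np

def train_local_model(X, y, lam=0.01, lr=0.1, epochs=100):
    """Train a regularized linear model with hinge loss on one user's local data."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        margins = y * (X @ w)
        viol = margins < 1                      # samples violating the hinge margin
        grad = lam * w                          # gradient of the L2 regularizer
        if viol.any():
            grad -= (X[viol] * y[viol, None]).mean(axis=0)  # hinge-loss subgradient
        w -= lr * grad
    return w

def noise_model(w, sigma, rng):
    """Output perturbation: add Gaussian noise to the locally trained model."""
    return w + rng.normal(scale=sigma, size=w.shape)

def blind_average(noisy_models):
    """Average the noisy models; in practice this sum would be computed
    via a secure summation protocol so no individual model is revealed."""
    return np.mean(noisy_models, axis=0)

# Illustrative usage with synthetic data for three users.
rng = np.random.default_rng(0)
users = [(rng.normal(size=(50, 10)), rng.choice([-1.0, 1.0], size=50))
         for _ in range(3)]
noisy = [noise_model(train_local_model(X, y), sigma=0.5, rng=rng)
         for X, y in users]
global_model = blind_average(noisy)
```

The key property this sketch illustrates is non-interactivity: users train and noise their models entirely locally and interact only once, through the summation that produces the averaged model.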