Paper Title
Federated Stochastic Primal-dual Learning with Differential Privacy
Paper Authors
Paper Abstract
Federated learning (FL) is a new paradigm that enables many clients to jointly train a machine learning (ML) model under the orchestration of a parameter server while keeping local data unexposed to any third party. However, FL training is an interactive process between local clients and the parameter server, and such a process may cause privacy leakage since adversaries can retrieve sensitive information by analyzing the overheard messages. In this paper, we propose a new federated stochastic primal-dual algorithm with differential privacy (FedSPD-DP). Compared to existing methods, the proposed FedSPD-DP incorporates local stochastic gradient descent (local SGD) and partial client participation (PCP) to address the issues of communication efficiency and straggler effects caused by randomly accessed clients. Our analysis shows that the data sampling strategy and PCP enhance data privacy, whereas a larger number of local SGD steps could increase privacy leakage, revealing a non-trivial tradeoff between communication efficiency and privacy protection. Specifically, we show that, by guaranteeing $(\epsilon, \delta)$-DP for each client per communication round, the proposed algorithm guarantees $(\mathcal{O}(q\epsilon\sqrt{pT}), \delta)$-DP after $T$ communication rounds while maintaining an $\mathcal{O}(1/\sqrt{pTQ})$ convergence rate for a convex and non-smooth learning problem, where $Q$ is the number of local SGD steps, $p$ is the client sampling probability, $q=\max_{i} q_i/\sqrt{1-q_i}$, and $q_i$ is the data sampling probability of each client under PCP. Experimental results are presented to evaluate the practical performance of the proposed algorithm and to compare it with state-of-the-art methods.
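To make the mechanism described in the abstract concrete, the following is a minimal Python/NumPy sketch of one communication round that combines partial client participation, local SGD with data subsampling, gradient clipping, and Gaussian noise for a per-round DP guarantee. It is an illustration under simplifying assumptions, not the paper's FedSPD-DP update: the primal-dual step is replaced by a plain SGD step, and the names `grad_fn`, `C` (clipping norm), and `sigma` (noise multiplier) are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_update(w, data, q_i, Q, lr, C, sigma, grad_fn):
    """Q local SGD steps with clipping and Gaussian noise (illustrative sketch)."""
    w = w.copy()
    for _ in range(Q):
        mask = rng.random(len(data)) < q_i              # each sample kept w.p. q_i
        batch = data[mask]
        if len(batch) == 0:                             # empty minibatch: skip step
            continue
        g = grad_fn(w, batch)
        # Clip the minibatch gradient to norm C, then add Gaussian noise.
        # (A rigorous DP analysis would clip per-example gradients; averaged
        # clipping is used here only to keep the sketch short.)
        g = g / max(1.0, np.linalg.norm(g) / C)
        g = g + rng.normal(0.0, sigma * C, size=g.shape)
        w = w - lr * g
    return w

def federated_round(w_global, clients, p, q_i, Q, lr, C, sigma, grad_fn):
    """One round: each client participates w.p. p; server averages local models."""
    local_models = [
        local_update(w_global, data, q_i, Q, lr, C, sigma, grad_fn)
        for data in clients
        if rng.random() < p                             # partial client participation
    ]
    if not local_models:                                # no client sampled this round
        return w_global
    return np.mean(local_models, axis=0)
```

Iterating `federated_round` for $T$ rounds corresponds to the regime analyzed in the abstract: if each round is $(\epsilon, \delta)$-DP per client, the overall guarantee composes to $(\mathcal{O}(q\epsilon\sqrt{pT}), \delta)$-DP with $q=\max_{i} q_i/\sqrt{1-q_i}$, reflecting the amplification from data subsampling and client sampling.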