Paper Title

Hybrid Differentially Private Federated Learning on Vertically Partitioned Data

Paper Authors

Chang Wang, Jian Liang, Mingkai Huang, Bing Bai, Kun Bai, Hao Li

Paper Abstract


We present HDP-VFL, the first hybrid differentially private (DP) framework for vertical federated learning (VFL), to demonstrate that it is possible to jointly learn a generalized linear model (GLM) from vertically partitioned data at only a negligible cost, w.r.t. training time, accuracy, etc., compared to idealized non-private VFL. Our work builds on recent advances in VFL-based collaborative training among different organizations, which rely on protocols like Homomorphic Encryption (HE) and Secure Multi-Party Computation (MPC) to secure computation and training. In particular, we analyze how VFL's intermediate result (IR) can leak private information about the training data during communication, and we design a DP-based privacy-preserving algorithm to ensure the data confidentiality of VFL participants. We mathematically prove that our algorithm not only provides utility guarantees for VFL, but also offers multi-level privacy, i.e., DP w.r.t. the IR and joint differential privacy (JDP) w.r.t. the model weights. Experimental results demonstrate that our work, under adequate privacy budgets, is quantitatively and qualitatively similar to GLMs learned in the idealized non-private VFL setting, without the increased memory and processing-time costs of most prior works based on HE or MPC. Our code will be released if this paper is accepted.
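The core mechanism described above, perturbing the intermediate result (IR) before it leaves a participant, can be illustrated with a minimal sketch. This is not the paper's actual algorithm: the function name, clipping bound, and noise scale below are hypothetical, and a real deployment would calibrate `sigma` to a target (ε, δ) budget via the Gaussian mechanism's analysis. The sketch only shows the general shape: a passive party computes its partial linear predictor on its feature slice, clips it to bound sensitivity, and adds Gaussian noise before communication.

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_intermediate_result(X_b, w_b, clip=1.0, sigma=0.5):
    """Hypothetical sketch of a DP-protected IR in vertical FL.

    A passive party holding feature slice X_b and local weights w_b
    computes its partial linear predictor (the IR), clips each entry
    to [-clip, clip] to bound per-sample sensitivity, and perturbs it
    with Gaussian noise before sending it to the other party.
    """
    ir = X_b @ w_b                                  # partial linear predictor
    ir = np.clip(ir, -clip, clip)                   # bound sensitivity
    noise = rng.normal(0.0, sigma * clip, size=ir.shape)
    return ir + noise

# Toy usage: party B holds 3 of the features for 4 samples.
X_b = rng.standard_normal((4, 3))
w_b = rng.standard_normal(3)
noisy_ir = dp_intermediate_result(X_b, w_b)
print(noisy_ir.shape)  # one noisy IR value per sample
```

The active party would then combine the received noisy IR with its own partial predictor to compute the GLM loss and gradients, so raw features and exact IRs are never exchanged in the clear.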
