Paper Title

Federated Doubly Stochastic Kernel Learning for Vertically Partitioned Data

Paper Authors

Bin Gu, Zhiyuan Dang, Xiang Li, Heng Huang

Paper Abstract

In many real-world data mining and machine learning applications, data are provided by multiple providers, each of which maintains private records of different feature sets about common entities. Training on such vertically partitioned data effectively and efficiently while preserving data privacy is challenging for traditional data mining and machine learning algorithms. In this paper, we focus on nonlinear learning with kernels and propose a federated doubly stochastic kernel learning (FDSKL) algorithm for vertically partitioned data. Specifically, we use random features to approximate the kernel mapping function and doubly stochastic gradients to update the solution, both of which are computed federatedly without disclosing the data. Importantly, we prove that FDSKL has a sublinear convergence rate and can guarantee data security under the semi-honest assumption. Extensive experimental results on a variety of benchmark datasets show that FDSKL is significantly faster than state-of-the-art federated learning methods when dealing with kernels, while retaining similar generalization performance.
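To make the core update concrete, below is a minimal, single-machine Python sketch of the doubly stochastic gradient idea with random Fourier features for an RBF kernel: each iteration draws one random data point and one random feature, and stores a single scalar coefficient. This is only the non-federated kernel of the method; the federated part of FDSKL (features partitioned across parties, with feature values computed collaboratively without revealing raw data) is omitted, and the function names, squared loss, regularization constant, and constant step size are illustrative assumptions rather than the paper's exact algorithm.

```python
import numpy as np

def random_feature(x, w, b):
    """Random Fourier feature phi(x) = sqrt(2) * cos(w^T x + b) for an RBF kernel."""
    return np.sqrt(2.0) * np.cos(x @ w + b)

def train_dskl(X, y, T=2000, gamma=1.0, step=0.1, reg=1e-4, seed=0):
    """Doubly stochastic kernel learning sketch (single machine, squared loss)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # One scalar coefficient per iteration; the (w, b) pairs are stored so the
    # random features can be regenerated when predicting.
    alphas = np.zeros(T)
    ws = rng.normal(scale=np.sqrt(2.0 * gamma), size=(T, d))  # spectral samples of exp(-gamma||x-y||^2)
    bs = rng.uniform(0.0, 2.0 * np.pi, size=T)

    for t in range(T):
        i = rng.integers(n)  # first source of randomness: a random data point
        # Prediction of the current iterate at x_i, built from the t features sampled so far.
        f_xi = sum(alphas[s] * random_feature(X[i], ws[s], bs[s]) for s in range(t))
        grad_loss = f_xi - y[i]            # derivative of the squared loss
        alphas[:t] *= 1.0 - step * reg     # shrink old coefficients (L2 regularizer)
        # second source of randomness: the fresh random feature (w_t, b_t)
        alphas[t] = -step * grad_loss * random_feature(X[i], ws[t], bs[t])

    def predict(x):
        return sum(alphas[s] * random_feature(x, ws[s], bs[s]) for s in range(T))
    return predict

# Example usage on synthetic data:
# X = np.random.randn(200, 5); y = np.sin(X[:, 0])
# predict = train_dskl(X, y)
# print(predict(X[0]), y[0])
```

Storing seeds and scalar coefficients instead of an explicit feature matrix is what keeps the per-iteration memory constant; in the federated setting, each party would evaluate its share of the inner product w^T x on its own feature subset.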
