Paper Title
Game of Privacy: Towards Better Federated Platform Collaboration under Privacy Restriction
Paper Authors
Paper Abstract
Vertical federated learning (VFL) aims to train models on cross-silo data whose feature spaces differ and are stored on different platforms. Existing VFL methods usually assume that all data on each platform can be used for model training. However, because of the intrinsic privacy risks of federated learning, the total amount of data involved may be constrained. In addition, existing VFL studies usually assume that only one platform has task labels and benefits from the collaboration, which makes it difficult to attract other platforms to join the collaborative learning. In this paper, we study the platform collaboration problem in VFL under privacy constraints. We propose to incentivize platforms through reciprocal collaboration, in which every platform can exploit multi-platform information within the VFL framework to benefit its own task. With a limited privacy budget, each platform needs to allocate its data quotas for collaboration with other platforms wisely. The platforms thus naturally form a multi-party game. This game poses two core problems: how to appraise the value of other platforms' data in order to compute game rewards, and how to optimize policies to solve the game. To evaluate the contribution of other platforms' data, each platform offers a small amount of "deposit" data to participate in VFL. We propose a performance estimation method that predicts the expected model performance for different combinations of inter-platform data amounts. To solve the game, we propose a platform negotiation method that simulates bargaining among platforms and locally optimizes their policies via gradient descent. Extensive experiments on two real-world datasets show that our approach can effectively facilitate the collaborative exploitation of multi-platform data in VFL under privacy restrictions.
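The quota-allocation game described in the abstract can be illustrated with a small toy model. The sketch below is not the paper's method: the reward function, the reciprocal-volume term, the `value` matrix, and the names `allocations`, `reward`, and `negotiate` are all hypothetical and chosen only for illustration. It assumes each platform splits a fixed privacy budget among its partners with a softmax policy, that the benefit from a partner grows concavely with the geometric mean of the two quotas exchanged, and that each platform improves its own reward by gradient ascent (equivalently, gradient descent on the negative reward) using finite-difference gradients.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def allocations(logits, budgets):
    """Map each platform's logits to quotas that sum to its privacy budget."""
    n = len(budgets)
    alloc = [[0.0] * n for _ in range(n)]
    for i in range(n):
        partners = [j for j in range(n) if j != i]
        shares = softmax([logits[i][j] for j in partners])
        for j, share in zip(partners, shares):
            alloc[i][j] = budgets[i] * share
    return alloc

def reward(i, alloc, value):
    """Assumed toy reward: concave gain from the reciprocal volume
    sqrt(a_ij * a_ji), weighted by how much platform i values partner j."""
    n = len(alloc)
    return sum(
        value[i][j] * math.log1p(math.sqrt(alloc[i][j] * alloc[j][i]))
        for j in range(n) if j != i
    )

def negotiate(budgets, value, rounds=200, lr=0.5, eps=1e-5):
    """Simultaneous local gradient ascent: every platform updates only its
    own logits, treating the other platforms' current policies as fixed."""
    n = len(budgets)
    logits = [[0.0] * n for _ in range(n)]
    for _ in range(rounds):
        updated = [row[:] for row in logits]
        for i in range(n):
            for j in range(n):
                if j == i:
                    continue
                base = logits[i][j]
                # Central finite difference of platform i's own reward.
                logits[i][j] = base + eps
                up = reward(i, allocations(logits, budgets), value)
                logits[i][j] = base - eps
                down = reward(i, allocations(logits, budgets), value)
                logits[i][j] = base
                updated[i][j] = base + lr * (up - down) / (2 * eps)
        logits = updated
    return allocations(logits, budgets)

if __name__ == "__main__":
    budgets = [10.0, 10.0, 10.0]
    # Hypothetical data values: platforms 0 and 1 value each other most.
    value = [[0, 3, 1], [3, 0, 1], [1, 1, 0]]
    for row in negotiate(budgets, value):
        print([round(a, 2) for a in row])
```

In this toy setting, platforms 0 and 1 shift most of their budgets toward each other because each assigns the other the highest data value, while every platform's quotas still sum to its budget. A real system would replace the assumed `reward` with a learned performance estimator such as the one the abstract describes.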