Paper Title

Group privacy for personalized federated learning

Paper Authors

Filippo Galli, Sayan Biswas, Kangsoo Jung, Tommaso Cucinotta, Catuscia Palamidessi

Paper Abstract

Federated learning (FL) is a type of collaborative machine learning in which participating peers/clients process their data locally and share only updates to the collaborative model. Among other benefits, this makes it possible to build privacy-aware distributed machine learning models. The goal is to optimize the parameters of a statistical model by minimizing a cost function over a collection of datasets stored locally by a set of clients. This process exposes the clients to two issues: leakage of private information and lack of personalization of the model. Moreover, with recent advancements in data-analysis techniques, concern over privacy violations of the participating clients has surged. To mitigate this, differential privacy and its variants serve as a standard for providing formal privacy guarantees. The clients often represent very heterogeneous communities and hold very diverse data. Therefore, in line with the FL community's recent focus on building frameworks of personalized models that represent the users' diversity, it is also of utmost importance to protect the clients' sensitive and personal information against potential threats. To address this goal, we consider $d$-privacy (also known as metric privacy), a variant of local differential privacy that uses a metric-based obfuscation technique to preserve the topological distribution of the original data. To protect the clients' privacy while allowing personalized model training that enhances the fairness and utility of the system, we propose a method that provides group privacy guarantees by exploiting key properties of $d$-privacy, enabling personalized models under the FL framework. We provide theoretical justification for the applicability of our method and experimental validation on real datasets to illustrate how it works.
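
For context, a mechanism $\mathcal{K}$ satisfies $\varepsilon d$-privacy if, for every pair of inputs $x, x'$ and every measurable output set $S$, $\Pr[\mathcal{K}(x) \in S] \le e^{\varepsilon\, d(x,x')} \Pr[\mathcal{K}(x') \in S]$, so the indistinguishability of two inputs degrades gracefully with their distance. The abstract does not describe the paper's concrete mechanism, so the snippet below is only a minimal sketch of the kind of metric-based obfuscation $d$-privacy refers to: the textbook Laplace mechanism calibrated to the Manhattan metric $d_1$. The function name `d_private_obfuscation` and the choice of metric are illustrative assumptions, not the authors' method.

```python
import numpy as np

def d_private_obfuscation(x: np.ndarray, epsilon: float) -> np.ndarray:
    """Obfuscate a real-valued vector with i.i.d. Laplace noise.

    Adding Laplace(0, 1/epsilon) noise to each coordinate satisfies
    (epsilon * d1)-privacy for the Manhattan metric d1: the output
    densities at any two inputs x, x' differ by a factor of at most
    exp(epsilon * d1(x, x')). Illustrative sketch only, not the
    paper's actual mechanism.
    """
    rng = np.random.default_rng()
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon, size=x.shape)
    return x + noise

# Example: a client obfuscates its local model update before sharing it.
update = np.array([0.12, -0.05, 0.33])
noisy_update = d_private_obfuscation(update, epsilon=2.0)
print(noisy_update)
```

Because the noise scale depends only on $\varepsilon$ and the metric, nearby inputs remain statistically close after obfuscation, which is the property that lets $d$-privacy preserve the topological distribution of the original data.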
