Paper title
Vertical Federated Learning: Challenges, Methodologies and Experiments
Paper authors
Abstract
Recently, federated learning (FL) has emerged as a promising distributed machine learning (ML) technology, driven by the advancing computational and sensing capacities of end-user devices and by increasing concerns about users' privacy. As a special architecture in FL, vertical FL (VFL) is capable of constructing a hyper ML model by embracing sub-models from different clients. These sub-models are trained locally on vertically partitioned data with distinct attributes. Therefore, the design of VFL is fundamentally different from that of conventional FL, raising new and unique research issues. In this paper, we discuss key challenges in VFL together with effective solutions, and conduct experiments on real-life datasets to shed light on these issues. Specifically, we first propose a general framework for VFL and highlight the key differences between VFL and conventional FL. Then, we discuss research challenges rooted in VFL systems under four aspects: security and privacy risks, expensive computation and communication costs, possible structural damage caused by model splitting, and system heterogeneity. Afterwards, we develop solutions to address these challenges, and conduct extensive experiments to showcase the effectiveness of the proposed solutions.
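To make the notion of vertically partitioned data concrete, the following is a minimal sketch in Python with NumPy. It is an illustration only and not the framework proposed in the paper: the two clients, the layer sizes, the `tanh` sub-models, and the names `local_forward` and `top_forward` are all assumptions made for this example. It shows two clients that hold different attributes of the same samples, each computing a local embedding, and a coordinating party combining the embeddings into one prediction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: 6 samples (rows) with 5 attributes (columns).
X = rng.normal(size=(6, 5))

# Vertical partitioning: both clients hold the SAME samples
# but DIFFERENT attributes (columns).
X_a = X[:, :2]  # hypothetical client A holds attributes 0-1
X_b = X[:, 2:]  # hypothetical client B holds attributes 2-4

# Each client owns a local sub-model (assumed here: one linear layer + tanh).
W_a = rng.normal(size=(2, 3))
W_b = rng.normal(size=(3, 3))


def local_forward(x_part, w):
    """Runs on a client: only the embedding is shared, not the raw attributes."""
    return np.tanh(x_part @ w)


# Top model held by the coordinating party (assumed: one linear layer).
W_top = rng.normal(size=(6, 1))


def top_forward(embeddings):
    """Concatenates client embeddings and maps them to one output per sample."""
    h = np.concatenate(embeddings, axis=1)
    return h @ W_top


h_a = local_forward(X_a, W_a)
h_b = local_forward(X_b, W_b)
y_hat = top_forward([h_a, h_b])
```

In this sketch the combined model is the composition of the client sub-models and the top model. Training, sample alignment across clients, and any privacy protection for the exchanged embeddings are deliberately left out.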