Paper Title
GraphFL: A Federated Learning Framework for Semi-Supervised Node Classification on Graphs
Paper Authors
Paper Abstract
Graph-based semi-supervised node classification (GraphSSC) has wide applications, ranging from networking and security to data mining and machine learning. However, existing centralized GraphSSC methods are impractical for many real-world graph-based problems, as collecting the entire graph and labeling a sufficient number of nodes is time-consuming and costly, and data privacy may also be violated. Federated learning (FL) is an emerging learning paradigm that enables collaborative learning among multiple clients, which can mitigate the issue of label scarcity and protect data privacy as well. Therefore, performing GraphSSC under the FL setting is a promising solution to real-world graph-based problems. However, existing FL methods 1) perform poorly when data across clients are non-IID, 2) cannot handle data with new label domains, and 3) cannot leverage unlabeled data, while all of these issues arise naturally in real-world graph-based problems. To address these issues, we propose the first FL framework, namely GraphFL, for semi-supervised node classification on graphs. Our framework is motivated by meta-learning methods. Specifically, we propose two GraphFL methods that respectively address the non-IID issue in graph data and handle tasks with new label domains. Furthermore, we design a self-training method to leverage unlabeled graph data. We adopt representative graph neural networks as GraphSSC methods and evaluate GraphFL on multiple graph datasets. Experimental results demonstrate that GraphFL significantly outperforms the compared FL baseline, and that GraphFL with self-training can obtain better performance.
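To make the two ideas in the abstract concrete, below is a minimal, self-contained sketch (not the authors' GraphFL implementation) of a meta-learning-style (Reptile-like) federated update across graph clients, followed by a simple self-training step that pseudo-labels unlabeled nodes. All class names, graph sizes, thresholds, and hyperparameters here are illustrative assumptions; the actual GraphFL methods, models, and datasets are described in the paper itself.

```python
# Illustrative sketch only: Reptile-style federated training of a toy GCN for
# semi-supervised node classification, plus a self-training (pseudo-labeling) pass.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleGCN(nn.Module):
    """Two-layer GCN operating on a dense normalized adjacency matrix."""
    def __init__(self, in_dim, hid_dim, n_classes):
        super().__init__()
        self.w1 = nn.Linear(in_dim, hid_dim)
        self.w2 = nn.Linear(hid_dim, n_classes)

    def forward(self, a_hat, x):
        h = F.relu(a_hat @ self.w1(x))
        return self.w2(a_hat @ h)  # per-node class logits

def normalize_adj(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2."""
    a = adj + torch.eye(adj.size(0))
    d = a.sum(1).pow(-0.5)
    return d.unsqueeze(1) * a * d.unsqueeze(0)

def local_update(global_model, a_hat, x, y, mask, steps=5, lr=0.01):
    """A few SGD steps on one client's labeled nodes; returns local weights."""
    local = copy.deepcopy(global_model)
    opt = torch.optim.SGD(local.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(local(a_hat, x)[mask], y[mask])
        loss.backward()
        opt.step()
    return local.state_dict()

def self_train(y, mask, logits, threshold=0.9):
    """Grow the labeled set with confident predictions on unlabeled nodes."""
    prob, pred = F.softmax(logits, dim=1).max(1)
    new_mask = mask | ((~mask) & (prob > threshold))
    y = torch.where(mask, y, pred)  # keep true labels, fill the rest with pseudo-labels
    return y, new_mask

# Toy federated setup: 3 clients, each holding a small random graph (non-IID in general).
torch.manual_seed(0)
n, d, c = 40, 16, 4
clients = []
for _ in range(3):
    adj = (torch.rand(n, n) < 0.1).float()
    adj = ((adj + adj.t()) > 0).float()
    clients.append({
        "a_hat": normalize_adj(adj),
        "x": torch.randn(n, d),
        "y": torch.randint(0, c, (n,)),
        "mask": torch.rand(n) < 0.2,  # roughly 20% of nodes are labeled
    })

server = SimpleGCN(d, 32, c)
for rnd in range(20):  # federated rounds
    deltas = []
    for cl in clients:
        local_w = local_update(server, cl["a_hat"], cl["x"], cl["y"], cl["mask"])
        deltas.append({k: local_w[k] - v for k, v in server.state_dict().items()})
    # Reptile-style server step: move the global weights toward the average local solution.
    new_state = {k: v + 0.5 * sum(dl[k] for dl in deltas) / len(deltas)
                 for k, v in server.state_dict().items()}
    server.load_state_dict(new_state)

# Self-training pass: each client augments its labeled set with confident predictions.
for cl in clients:
    with torch.no_grad():
        logits = server(cl["a_hat"], cl["x"])
    cl["y"], cl["mask"] = self_train(cl["y"], cl["mask"], logits)
```

The Reptile-style interpolation toward local solutions (rather than plain FedAvg weight averaging) is what the meta-learning motivation suggests for mitigating non-IID client data, and the thresholded pseudo-labeling step is one common way to exploit unlabeled nodes; both are sketched here only to clarify the abstract, not to reproduce the paper's exact algorithms.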