Paper Title
Self-supervised Heterogeneous Graph Pre-training Based on Structural Clustering
Paper Authors
Paper Abstract
Recent self-supervised pre-training methods on Heterogeneous Information Networks (HINs) have shown promising competitiveness over traditional semi-supervised Heterogeneous Graph Neural Networks (HGNNs). Unfortunately, their performance heavily depends on the careful customization of various strategies for generating high-quality positive and negative examples, which notably limits their flexibility and generalization ability. In this work, we present SHGP, a novel Self-supervised Heterogeneous Graph Pre-training approach that does not need to generate any positive or negative examples. It consists of two modules that share the same attention-aggregation scheme. In each iteration, the Att-LPA module produces pseudo-labels through structural clustering, which serve as self-supervision signals to guide the Att-HGNN module in learning object embeddings and attention coefficients. The two modules effectively utilize and enhance each other, encouraging the model to learn discriminative embeddings. Extensive experiments on four real-world datasets demonstrate the superiority of SHGP over state-of-the-art unsupervised baselines and even semi-supervised baselines. We release our source code at: https://github.com/kepsail/SHGP.
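To make the described pipeline concrete, below is a minimal PyTorch sketch of the training loop the abstract outlines: one set of attention coefficients is shared by both modules, an Att-LPA-style propagation of the current pseudo-label matrix produces hard pseudo-labels, and those pseudo-labels supervise the Att-HGNN-style feature aggregation via cross-entropy. This is a simplified illustration, not the paper's implementation: all names (`SharedAttention`, `propagate`, `train_step`, `segment_softmax`, `soft_labels`, `clf`) are assumptions, the graph is an already-homogenized edge list rather than a full HIN with type-specific transformations, and the label-propagation step merely stands in for the paper's structural clustering. See the repository above for the actual code.

```python
# Hypothetical sketch of an SHGP-style iteration; simplified from the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F

def segment_softmax(scores, index, num_nodes):
    # Softmax over per-edge scores, grouped by destination node `index`.
    exp = (scores - scores.max()).exp()  # global shift for numerical stability
    denom = torch.zeros(num_nodes, device=scores.device).index_add_(0, index, exp)
    return exp / denom[index].clamp(min=1e-12)

class SharedAttention(nn.Module):
    # Produces the per-edge attention coefficients shared by both modules.
    def __init__(self, dim):
        super().__init__()
        self.att = nn.Linear(2 * dim, 1)

    def forward(self, x, edge_index):
        src, dst = edge_index  # each of shape [E]
        scores = torch.tanh(self.att(torch.cat([x[src], x[dst]], dim=-1))).squeeze(-1)
        return segment_softmax(scores, dst, x.size(0))

def propagate(values, edge_index, alpha):
    # Attention-weighted aggregation of `values` along edges (dst <- src).
    src, dst = edge_index
    out = torch.zeros_like(values)
    out.index_add_(0, dst, alpha.unsqueeze(-1) * values[src])
    return out

def train_step(x, edge_index, soft_labels, attn, clf, opt):
    alpha = attn(x, edge_index)  # coefficients shared by both modules
    # "Att-LPA" step (assumed form): propagate the current soft pseudo-label
    # matrix with attention and harden it -- a stand-in for the structural
    # clustering described in the paper.
    with torch.no_grad():
        pseudo = propagate(soft_labels, edge_index, alpha).argmax(dim=-1)
    # "Att-HGNN" step: aggregate features with the SAME coefficients and fit
    # the pseudo-labels via cross-entropy -- the self-supervision signal.
    z = propagate(x, edge_index, alpha)
    loss = F.cross_entropy(clf(z), pseudo)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

In the actual method, the pseudo-labels come from structural clustering on the heterogeneous graph and the two modules refine each other over many iterations; the sketch compresses that interplay into a single step purely to show how shared attention coefficients let label propagation supervise embedding learning without any positive or negative examples.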