GCN for HIN via Implicit Utilization of Attention and Meta-paths

Paper Title

GCN for HIN via Implicit Utilization of Attention and Meta-paths

Authors

Di Jin, Zhizhi Yu, Dongxiao He, Carl Yang, Philip S. Yu, Jiawei Han

Abstract

Heterogeneous information network (HIN) embedding, which aims to map the structural and semantic information in a HIN to distributed representations, has drawn considerable research attention. Graph neural networks for HIN embedding typically adopt hierarchical attention (node-level and meta-path-level) to capture information from meta-path-based neighbors. However, this complicated attention structure often fails to select meta-paths effectively because of severe overfitting. Moreover, when propagating information, these methods do not distinguish direct (one-hop) meta-paths from indirect (multi-hop) ones. From the perspective of network science, however, direct relationships are often believed to be more essential, and they can only be used to model direct information propagation. To address these limitations, we propose a novel neural network method that implicitly utilizes attention and meta-paths, which relieves the severe overfitting caused by current over-parameterized attention mechanisms on HINs. We first use a multi-layer graph convolutional network (GCN) framework, which performs a discriminative aggregation at each layer and stacks the information propagation of directly linked meta-paths layer by layer, thereby realizing the function of attention for selecting meta-paths in an indirect way. We then give an effective relaxation and improvement by introducing a new propagation operation that can be separated from aggregation. That is, we first model the whole propagation process with well-defined probabilistic diffusion dynamics, and then introduce a random-graph-based constraint that allows it to reduce noise as the number of layers increases. Extensive experiments demonstrate the superiority of the new approach over state-of-the-art methods.
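
The idea of stacking propagation over direct (one-hop) relations so that multi-hop meta-paths emerge implicitly can be illustrated with a small sketch. This is not the authors' implementation: the toy author-paper graph, the helper names (`matmul`, `row_normalize`, `propagate`), and the plain mean aggregation are all assumptions made for illustration, and the sketch omits the discriminative aggregation, the diffusion dynamics, and the random-graph-based constraint described in the abstract.

```python
# Illustrative sketch only: the toy graph and all names here are assumptions,
# not taken from the paper.

def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def row_normalize(a):
    """Scale each row to sum to 1; all-zero rows are left as zeros."""
    out = []
    for row in a:
        s = sum(row)
        out.append([x / s if s else 0.0 for x in row])
    return out

def propagate(adj, features):
    """One mean-aggregation step over a single direct relation."""
    return matmul(row_normalize(adj), features)

# Toy HIN with 3 authors and 2 papers.
# A_AP[i][j] = 1 if author i wrote paper j (direct Author-Paper relation).
A_AP = [[1, 0],
        [1, 1],
        [0, 1]]
# Transpose gives the direct Paper-Author relation.
A_PA = [list(col) for col in zip(*A_AP)]

# Composing the two direct relations gives the multi-hop meta-path
# Author-Paper-Author: entry (i, j) counts papers shared by authors i and j.
A_APA = matmul(A_AP, A_PA)

# Two stacked layers that each use only a direct relation.
author_feats = [[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0],
                [0.0, 0.0, 1.0]]               # one-hot features per author
paper_hidden = propagate(A_PA, author_feats)    # layer 1: papers <- authors
author_hidden = propagate(A_AP, paper_hidden)   # layer 2: authors <- papers

# With one-hot inputs, author_hidden is the stacked propagation operator.
# Its nonzero pattern matches the Author-Paper-Author meta-path adjacency.
support_stacked = [[v > 0 for v in row] for row in author_hidden]
support_apa = [[v > 0 for v in row] for row in A_APA]
```

In this toy case the two-layer stack reaches exactly the author pairs connected by the Author-Paper-Author meta-path, which is the sense in which layer-wise propagation over direct links can cover indirect meta-paths without enumerating them explicitly.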
