Boosting the Cycle Counting Power of Graph Neural Networks with I$^2$-GNNs

Paper title

Boosting the Cycle Counting Power of Graph Neural Networks with I$^2$-GNNs

Authors

Yinan Huang, Xingang Peng, Jianzhu Ma, Muhan Zhang

Abstract

Message Passing Neural Networks (MPNNs) are a widely used class of Graph Neural Networks (GNNs). The limited representational power of MPNNs has inspired the study of provably powerful GNN architectures. However, knowing that one model is more powerful than another gives little insight into which functions they can or cannot express. It is still unclear whether these models are able to approximate specific functions such as counting certain graph substructures, which is essential for applications in biology, chemistry and social network analysis. Motivated by this, we propose to study the counting power of Subgraph MPNNs, a recent and popular class of powerful GNN models that extract a rooted subgraph for each node, assign the root node a unique identifier and encode the root node's representation within its rooted subgraph. Specifically, we prove that Subgraph MPNNs fail to count cycles of length greater than 4 at the node level, implying that node representations cannot correctly encode surrounding substructures such as ring systems with more than four atoms. To overcome this limitation, we propose I$^2$-GNNs, which extend Subgraph MPNNs by assigning different identifiers to the root node and its neighbors in each subgraph. The discriminative power of I$^2$-GNNs is shown to be strictly stronger than that of Subgraph MPNNs and partially stronger than that of the 3-WL test. More importantly, I$^2$-GNNs are proven capable of counting all 3-, 4-, 5- and 6-cycles, covering common substructures such as benzene rings in organic chemistry, while still keeping linear complexity. To the best of our knowledge, I$^2$-GNN is the first linear-time GNN model that can count 6-cycles with theoretical guarantees. We validate its counting power in cycle counting tasks and demonstrate its competitive performance in molecular prediction benchmarks.
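To make the node-level counting target in the abstract concrete, the following is a minimal brute-force sketch of what "the number of k-cycles through a node" means. It is not the I$^2$-GNN model and not the authors' code; the function name `count_cycles_per_node` and the adjacency-dictionary input format are assumptions made for illustration only. It enumerates simple paths, so it is only practical for small graphs.

```python
def count_cycles_per_node(adj, k):
    """Return {node: number of simple k-cycles passing through that node}.

    adj is a dict mapping each node to the set of its neighbors in an
    undirected simple graph (hypothetical input format). Requires k >= 3.
    """
    if k < 3:
        raise ValueError("a simple cycle needs at least 3 nodes")
    counts = {}
    for root in adj:
        closed_paths = 0
        # Depth-first search over simple paths that start at root.
        stack = [(root, (root,))]
        while stack:
            node, path = stack.pop()
            if len(path) == k:
                # A path with k distinct nodes closes into a k-cycle
                # if its last node is adjacent to the root.
                if root in adj[node]:
                    closed_paths += 1
                continue
            for nxt in adj[node]:
                if nxt not in path:
                    stack.append((nxt, path + (nxt,)))
        # Every cycle through root is found once per traversal direction.
        counts[root] = closed_paths // 2
    return counts


# A 6-ring, the carbon skeleton of a benzene ring.
hexagon = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
# The complete graph on 4 nodes.
k4 = {i: {j for j in range(4) if j != i} for i in range(4)}

six_cycles = count_cycles_per_node(hexagon, 6)
triangles = count_cycles_per_node(k4, 3)
```

Under these assumptions, every node of the 6-ring lies on exactly one 6-cycle, which is the kind of per-node label a cycle counting task would ask a model to predict.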
