Self-supervised edge features for improved Graph Neural Network training

Paper title

Self-supervised edge features for improved Graph Neural Network training

Authors

Arijit Sehanobish, Neal G. Ravindra, David van Dijk

Abstract

Graph Neural Networks (GNNs) have been used extensively to extract meaningful representations from graph-structured data and to perform predictive tasks such as node classification and link prediction. In recent years, much work has incorporated edge features alongside node features for prediction tasks. One of the main difficulties in using edge features is that they are often handcrafted, hard to obtain, specific to a particular domain, and may contain redundant information. In this work, we present a framework for creating new edge features, applicable to any domain, via a combination of self-supervised and unsupervised learning. In addition, we use Forman-Ricci curvature as an additional edge feature to encapsulate the local geometry of the graph. We then encode our edge features via a Set Transformer and combine them with node features extracted from popular GNN architectures for node classification in an end-to-end training scheme. We validate our work on three biological datasets comprising single-cell RNA sequencing data of neurological disease, in vitro SARS-CoV-2 infection, and human COVID-19 patients. We demonstrate that our method achieves better performance on node classification tasks than baseline Graph Attention Network (GAT) and Graph Convolutional Network (GCN) models. Furthermore, given the attention mechanism on edge and node features, we are able to interpret the cell types and genes that determine the course and severity of COVID-19, contributing to a growing list of potential disease biomarkers and therapeutic targets.
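As a rough illustration of the curvature-based edge feature mentioned in the abstract, the sketch below computes the simplest combinatorial form of Forman-Ricci curvature on an unweighted, undirected graph, where the curvature of an edge (u, v) reduces to 4 - deg(u) - deg(v). This is an assumption made for illustration: the abstract does not say which variant (weighted, triangle-augmented, or otherwise) the authors use, and the function name `forman_ricci_unweighted` and the toy graph are hypothetical.

```python
from collections import defaultdict


def forman_ricci_unweighted(edges):
    """Return {(u, v): curvature} for an undirected, unweighted simple graph.

    Assumes every node and edge weight is 1 and ignores higher-order faces
    (triangles), which is the simplest combinatorial form of the curvature.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # With unit weights, the Forman-Ricci curvature of an edge (u, v)
    # reduces to 4 - deg(u) - deg(v).
    return {(u, v): 4 - degree[u] - degree[v] for u, v in edges}


# Hypothetical toy graph: a path a - b - c - d.
toy_edges = [("a", "b"), ("b", "c"), ("c", "d")]
curvature = forman_ricci_unweighted(toy_edges)
```

Edges that join high-degree nodes receive strongly negative values under this definition, so a scalar like this could be attached to each edge alongside learned edge features before they are encoded.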
