Title
Flaky Performances when Pretraining on Relational Databases
Authors
Abstract
We explore the downstream task performance of graph neural network (GNN) self-supervised learning (SSL) methods trained on subgraphs extracted from relational databases (RDBs). Intuitively, this joint use of SSL and GNNs should make it possible to leverage more of the available data, which could translate to better results. However, we find that naively porting contrastive SSL techniques can cause ``negative transfer'': linear evaluation on fixed representations from a pretrained model performs worse than on representations from a randomly initialized model. Based on the conjecture that contrastive SSL conflicts with the GNN's message passing layers, we propose InfoNode: a contrastive loss aiming to maximize the mutual information between a node's initial- and final-layer representations. Our empirical results support our conjecture and the effectiveness of InfoNode.
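The abstract does not give InfoNode's exact form. Below is a minimal sketch, assuming a standard InfoNCE-style estimator in PyTorch, of a node-level contrastive loss that treats each node's initial- and final-layer embeddings as a positive pair and all other nodes in the batch as negatives. The function name infonode_loss and the temperature value are illustrative assumptions, not the authors' implementation.

    # Hypothetical sketch of an InfoNCE-style node-level contrastive loss.
    # Maximizing this InfoNCE objective lower-bounds the mutual information
    # between a node's initial-layer and final-layer representations.
    import torch
    import torch.nn.functional as F

    def infonode_loss(h_init: torch.Tensor, h_final: torch.Tensor,
                      temperature: float = 0.1) -> torch.Tensor:
        # h_init, h_final: (num_nodes, dim) embeddings of the same nodes
        # before and after message passing. For node i, the pair
        # (h_init[i], h_final[i]) is the positive; every other node in the
        # batch serves as a negative.
        z1 = F.normalize(h_init, dim=-1)
        z2 = F.normalize(h_final, dim=-1)
        logits = z1 @ z2.t() / temperature  # (N, N) similarity matrix
        targets = torch.arange(z1.size(0), device=z1.device)
        # Symmetrize over both matching directions: initial -> final
        # and final -> initial.
        return 0.5 * (F.cross_entropy(logits, targets)
                      + F.cross_entropy(logits.t(), targets))

Under this reading, the loss pulls each node's final-layer embedding back toward its own initial features, which is one plausible way to counteract the over-smoothing effect of message passing that the negative-transfer conjecture points to.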