Paper Title
Towards Generalizable Graph Contrastive Learning: An Information Theory Perspective
Paper Authors
Paper Abstract
Graph contrastive learning (GCL) has emerged as one of the most representative approaches to graph representation learning, leveraging the principle of mutual information maximization (InfoMax) to learn node representations for downstream tasks. To achieve better generalization from GCL to downstream tasks, previous methods heuristically define data augmentations or pretext tasks. However, the generalization ability of GCL and its underlying theoretical principles remain largely unexplored. In this paper, we first propose a metric named GCL-GE for the generalization ability of GCL. Since this metric is intractable when the downstream task is unknown, we theoretically prove a mutual information upper bound for it from an information-theoretic perspective. Guided by this bound, we design a GCL framework named InfoAdv with enhanced generalization ability, which jointly optimizes the generalization metric and InfoMax to strike the right balance between pretext-task fitting and generalization to downstream tasks. We empirically validate our theoretical findings on a number of representative benchmarks, and the experimental results demonstrate that our model achieves state-of-the-art performance.
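For context, the InfoMax principle referenced in the abstract is commonly instantiated in GCL through an InfoNCE-style contrastive objective between two augmented views of the same graph. The snippet below is a minimal illustrative sketch of such an objective in PyTorch; the function name infonce_loss and the temperature value are assumptions for illustration, and this is not the paper's InfoAdv implementation, which additionally optimizes the GCL-GE generalization term.

```python
import torch
import torch.nn.functional as F

def infonce_loss(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """Symmetric InfoNCE objective between two views of node embeddings.

    z1, z2: [num_nodes, dim] embeddings of the same nodes under two graph views.
    Positive pairs are the same node across views; all other nodes serve as negatives.
    """
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    # Cosine similarities between every node in view 1 and every node in view 2.
    sim = z1 @ z2.t() / tau                      # [N, N]
    labels = torch.arange(z1.size(0), device=z1.device)
    # Cross-entropy with the diagonal (same node, other view) as the positive class,
    # averaged over both directions for symmetry.
    return 0.5 * (F.cross_entropy(sim, labels) + F.cross_entropy(sim.t(), labels))
```

In the paper's framing, maximizing such an objective alone only fits the pretext task; InfoAdv jointly optimizes it with the GCL-GE generalization metric so that pretext-task fitting and downstream generalization are balanced.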