Paper Title

GOOD: A Graph Out-of-Distribution Benchmark

Paper Authors

Shurui Gui, Xiner Li, Limei Wang, Shuiwang Ji

Paper Abstract

Out-of-distribution (OOD) learning deals with scenarios in which training and test data follow different distributions. Although general OOD problems have been intensively studied in machine learning, graph OOD is only an emerging area of research. Currently, there is no systematic benchmark tailored to graph OOD method evaluation. In this work, we aim to develop an OOD benchmark, known as GOOD, specifically for graphs. We explicitly distinguish between covariate and concept shifts and design data splits that accurately reflect different shifts. We consider both graph and node prediction tasks, as there are key differences in designing shifts for them. Overall, GOOD contains 11 datasets with 17 domain selections. When combined with covariate, concept, and no shifts, we obtain 51 different splits. We provide performance results on 10 commonly used baseline methods with 10 random runs. This results in 510 dataset-model combinations in total. Our results show significant performance gaps between in-distribution and OOD settings. Our results also shed light on different performance trends between covariate and concept shifts across methods. Our GOOD benchmark is a growing project and is expected to expand in both quantity and variety of resources as the area develops. The GOOD benchmark can be accessed via https://github.com/divelab/GOOD/.
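The covariate/concept distinction the abstract draws can be made concrete with a toy split procedure: a covariate split holds out entire input domains from training, while a concept split breaks the domain-label correlation present in the training set. The sample format, function names, and the spurious-correlation rule below are illustrative assumptions for this sketch, not the actual GOOD API.

```python
# Toy sketch of the two shift types GOOD distinguishes.
# Each sample is a (domain, label) pair; real GOOD splits operate on
# graph/node datasets, but the splitting logic follows the same idea.

def covariate_split(samples, held_out_domains):
    """Covariate shift: test data come from input domains never seen
    during training, while the labeling rule itself is unchanged."""
    train = [s for s in samples if s[0] not in held_out_domains]
    test = [s for s in samples if s[0] in held_out_domains]
    return train, test

def concept_split(samples, spurious_label):
    """Concept shift: the domain-label correlation that holds in the
    training set is deliberately broken in the test set."""
    train = [s for s in samples if spurious_label(s[0]) == s[1]]
    test = [s for s in samples if spurious_label(s[0]) != s[1]]
    return train, test

# Illustrative data: domains a-e cycling, with alternating binary labels.
samples = [(d, i % 2) for i, d in enumerate("abcde" * 20)]

# Covariate split: domain "e" appears only at test time.
cov_train, cov_test = covariate_split(samples, held_out_domains={"e"})

# Concept split: training keeps samples where "domain in {a,b,c}"
# predicts the label; test keeps the samples that violate that rule.
con_train, con_test = concept_split(samples, spurious_label=lambda d: d in "abc")
```

A model that latches onto the domain identity will do well on `con_train` but fail on `con_test`, which is exactly the failure mode concept-shift splits are designed to expose.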
