Bringing Light Into the Dark: A Large-scale Evaluation of Knowledge Graph Embedding Models Under a Unified Framework

Paper title

Bringing Light Into the Dark: A Large-scale Evaluation of Knowledge Graph Embedding Models Under a Unified Framework

Authors

Mehdi Ali, Max Berrendorf, Charles Tapley Hoyt, Laurent Vermue, Mikhail Galkin, Sahand Sharifzadeh, Asja Fischer, Volker Tresp, Jens Lehmann

Abstract

The heterogeneity in recently published knowledge graph embedding models' implementations, training, and evaluation has made fair and thorough comparisons difficult. In order to assess the reproducibility of previously published results, we re-implemented and evaluated 21 interaction models in the PyKEEN software package. Here, we outline which results could be reproduced with their reported hyper-parameters, which could only be reproduced with alternate hyper-parameters, and which could not be reproduced at all, as well as provide insight as to why this might be the case. We then performed a large-scale benchmarking on four datasets with several thousands of experiments and 24,804 GPU hours of computation time. We present insights gained as to best practices, best configurations for each model, and where improvements could be made over previously published best configurations. Our results highlight that the combination of model architecture, training approach, loss function, and the explicit modeling of inverse relations is crucial for a model's performance, which is not determined by the model architecture alone. We provide evidence that several architectures can obtain results competitive to the state-of-the-art when configured carefully. We have made all code, experimental configurations, results, and analyses that lead to our interpretations available at https://github.com/pykeen/pykeen and https://github.com/pykeen/benchmarking.
