A Benchmarking Study of Embedding-based Entity Alignment for Knowledge Graphs

Paper title

A Benchmarking Study of Embedding-based Entity Alignment for Knowledge Graphs

Authors

Sun, Zequn; Zhang, Qingheng; Hu, Wei; Wang, Chengming; Chen, Muhao; Akrami, Farahnaz; Li, Chengkai

Abstract

Entity alignment seeks to find entities in different knowledge graphs (KGs) that refer to the same real-world object. Recent advancement in KG embedding impels the advent of embedding-based entity alignment, which encodes entities in a continuous embedding space and measures entity similarities based on the learned embeddings. In this paper, we conduct a comprehensive experimental study of this emerging field. We survey 23 recent embedding-based entity alignment approaches and categorize them based on their techniques and characteristics. We also propose a new KG sampling algorithm, with which we generate a set of dedicated benchmark datasets with various heterogeneity and distributions for a realistic evaluation. We develop an open-source library including 12 representative embedding-based entity alignment approaches, and extensively evaluate these approaches, to understand their strengths and limitations. Additionally, for several directions that have not been explored in current approaches, we perform exploratory experiments and report our preliminary findings for future studies. The benchmark datasets, open-source library and experimental results are all accessible online and will be duly maintained.

Paper title