Paper Title
An Industry Evaluation of Embedding-based Entity Alignment
Paper Authors
Abstract
Embedding-based entity alignment has been widely investigated in recent years, but most proposed methods still rely on an ideal supervised learning setting with a large number of unbiased seed mappings for training and validation, which significantly limits their usage. In this study, we evaluate those state-of-the-art methods in an industrial context, where the impact of seed mappings with different sizes and different biases is explored. Besides the popular benchmarks from DBpedia and Wikidata, we contribute and evaluate a new industrial benchmark that is extracted from two heterogeneous knowledge graphs (KGs) under deployment for medical applications. The experimental results enable the analysis of the advantages and disadvantages of these alignment methods and the further discussion of suitable strategies for their industrial deployment.