在大型RDF数据上查找最小连接的子图和本体探索

论文标题

在大型RDF数据上查找最小连接的子图和本体探索

Finding Minimum Connected Subgraphs with Ontology Exploration on Large RDF Data

论文作者

Ren, Xiangnan, Sengupta, Neha, Ren, Xuguang, Wang, Junhu, Curé, Olivier

论文摘要

在本文中，我们研究了以下问题：给定知识图（kg）和一组输入顶点（代表概念或实体）和边缘标签，我们旨在找到包含所有输入的最小连接子图。这个问题在基于KG的搜索引擎和自然语言答案系统中起着关键作用，这是Steiner树问题的自然扩展，该问题已知是NP-HARD。我们提出侦察，这是一个用于查找近似答案的系统。侦察的目的是通过瞬时响应（即，秒/毫秒延迟）对具有数亿个边缘的瞬时响应（即，秒/毫秒延迟）实现高精度，而无需诉诸昂贵的计算资源。此外，当由于概念和实体之间的断开连接而没有答案时，侦察会根据本体论将输入提高到语义上相似的输入，并试图找到有关精制输入的答案。我们对侦察进行了全面的实验评估。特别是，我们将其与发现近似施泰纳树的五种现有方法进行了比较。我们对四个大型真实和合成KGS的实验表明，侦察大大优于其竞争对手，并产生了小得多的记忆足迹。

In this paper, we study the following problem: given a knowledge graph (KG) and a set of input vertices (representing concepts or entities) and edge labels, we aim to find the smallest connected subgraphs containing all of the inputs. This problem plays a key role in KG-based search engines and natural language question answering systems, and it is a natural extension of the Steiner tree problem, which is known to be NP-hard. We present RECON, a system for finding approximate answers. RECON aims at achieving high accuracy with instantaneous response (i.e., sub-second/millisecond delay) over KGs with hundreds of millions edges without resorting to expensive computational resources. Furthermore, when no answer exists due to disconnection between concepts and entities, RECON refines the input to a semantically similar one based on the ontology, and attempt to find answers with respect to the refined input. We conduct a comprehensive experimental evaluation of RECON. In particular we compare it with five existing approaches for finding approximate Steiner trees. Our experiments on four large real and synthetic KGs show that RECON significantly outperforms its competitors and incurs a much smaller memory footprint.

下载PDF全文

下载文献需遵守相关版权规定

论文标题