A Dataset for Hyper-Relational Extraction and a Cube-Filling Approach

Paper Title

A Dataset for Hyper-Relational Extraction and a Cube-Filling Approach

Authors

Chia, Yew Ken; Bing, Lidong; Aljunied, Sharifah Mahani; Si, Luo; Poria, Soujanya

Abstract

Relation extraction has the potential for large-scale knowledge graph construction, but current methods do not consider the qualifier attributes of each relation triplet, such as time, quantity, or location. The qualifiers form hyper-relational facts, which better capture the rich and complex structure of knowledge graphs. For example, the relation triplet (Leonard Parker, Educated At, Harvard University) can be factually enriched by including the qualifier (End Time, 1967). Hence, we propose the task of hyper-relational extraction to extract more specific and complete facts from text. To support the task, we construct HyperRED, a large-scale and general-purpose dataset. Existing models cannot perform hyper-relational extraction, as it requires a model to consider the interaction between three entities. Hence, we propose CubeRE, a cube-filling model that is inspired by table-filling approaches and explicitly considers the interaction between relation triplets and qualifiers. To improve model scalability and reduce negative class imbalance, we further propose a cube-pruning method. Our experiments show that CubeRE outperforms strong baselines and reveal possible directions for future research. Our code and data are available at github.com/declare-lab/HyperRED.
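To make the example in the abstract concrete, the sketch below shows one possible way to represent a hyper-relational fact as a relation triplet plus a list of qualifiers. The class names `Qualifier` and `HyperRelationalFact` and the helper `cube_cells` are hypothetical, chosen only for illustration. They are not taken from the HyperRED repository, and the cell mapping is a loose picture of the "interaction between three entities", not the CubeRE implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Qualifier:
    """A qualifier attached to a relation triplet (hypothetical structure)."""
    label: str  # e.g. "End Time"
    value: str  # e.g. "1967"


@dataclass
class HyperRelationalFact:
    """A relation triplet enriched with zero or more qualifiers."""
    head: str
    relation: str
    tail: str
    qualifiers: List[Qualifier] = field(default_factory=list)

    def cube_cells(self) -> Dict[Tuple[str, str, str], str]:
        # Assumption: each qualifier links three entities (head, tail, value),
        # so it can be pictured as one labeled cell in a three-dimensional grid.
        return {(self.head, self.tail, q.value): q.label for q in self.qualifiers}


# The example from the abstract.
fact = HyperRelationalFact(
    head="Leonard Parker",
    relation="Educated At",
    tail="Harvard University",
    qualifiers=[Qualifier(label="End Time", value="1967")],
)
cells = fact.cube_cells()
print(cells)
```

Keying each cell by three entities suggests why the number of candidate cells grows quickly with input length, which may be the scalability concern the cube-pruning method is said to address.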
