Paper Title
Relational Triple Extraction: One Step is Enough
Paper Authors
Paper Abstract
Extracting relational triples from unstructured text is an essential task in natural language processing and knowledge graph construction. Existing approaches usually contain two fundamental steps: (1) finding the boundary positions of head and tail entities; (2) concatenating specific tokens to form triples. However, nearly all previous methods suffer from the problem of error accumulation, i.e., the boundary recognition error of each entity in step (1) will be accumulated into the final combined triples. To solve the problem, in this paper, we introduce a fresh perspective to revisit the triple extraction task, and propose a simple but effective model, named DirectRel. Specifically, the proposed model first generates candidate entities through enumerating token sequences in a sentence, and then transforms the triple extraction task into a linking problem on a "head $\rightarrow$ tail" bipartite graph. By doing so, all triples can be directly extracted in only one step. Extensive experimental results on two widely used datasets demonstrate that the proposed model performs better than the state-of-the-art baselines.
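The one-step idea described in the abstract (enumerate token sequences as candidate entities, then decide which "head → tail" links exist) can be illustrated with a small sketch. This is not the DirectRel model itself: the maximum span length, the threshold, the rule that a span is never linked to itself, and the `toy_link_score` function are all assumptions made for illustration. In particular, `toy_link_score` is a hypothetical lookup that stands in for whatever learned scorer the paper uses.

```python
from itertools import product


def enumerate_spans(tokens, max_len=3):
    """Return every contiguous span of 1..max_len tokens as (start, end), end exclusive."""
    spans = []
    for start in range(len(tokens)):
        last_end = min(start + max_len, len(tokens))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end))
    return spans


def span_text(tokens, span):
    """Join the tokens covered by a (start, end) span into a string."""
    return " ".join(tokens[span[0]:span[1]])


def extract_triples(tokens, relations, link_score, max_len=3, threshold=0.5):
    """Score every (head span, relation, tail span) link and keep those above threshold.

    Candidate spans play both roles: one side of the bipartite graph holds
    head candidates, the other side holds tail candidates.
    """
    spans = enumerate_spans(tokens, max_len)
    triples = []
    for head, tail in product(spans, spans):
        if head == tail:
            continue  # assumption: a span is not linked to itself
        for rel in relations:
            if link_score(tokens, head, rel, tail) > threshold:
                triples.append((span_text(tokens, head), rel, span_text(tokens, tail)))
    return triples


def toy_link_score(tokens, head, rel, tail):
    """Hypothetical scorer: a fixed lookup standing in for a trained model."""
    known = {("Barack Obama", "born_in", "Honolulu")}
    triple = (span_text(tokens, head), rel, span_text(tokens, tail))
    return 1.0 if triple in known else 0.0


tokens = "Barack Obama was born in Honolulu".split()
triples = extract_triples(tokens, ["born_in"], toy_link_score)
print(triples)
```

Because entity boundaries are never predicted separately, a span such as "Barack Obama" is either linked as a whole or not at all, which is how this formulation sidesteps the boundary errors of step (1) in the two-step pipeline.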