DORE: Document Ordered Relation Extraction based on Generative Framework

Paper Title

DORE: Document Ordered Relation Extraction based on Generative Framework

Authors

Qipeng Guo, Yuqing Yang, Hang Yan, Xipeng Qiu, Zheng Zhang

Abstract

In recent years, there has been a surge of generation-based information extraction work, which allows a more direct use of pre-trained language models and efficiently captures output dependencies. However, previous generative methods using lexical representation do not naturally fit document-level relation extraction (DocRE), where there are multiple entities and relational facts. In this paper, we investigate the root cause of the underwhelming performance of existing generative DocRE models and discover that the culprit is the inadequacy of the training paradigm rather than the capacity of the models. We propose to generate a symbolic and ordered sequence from the relation matrix, which is deterministic and easier for the model to learn. Moreover, we design a parallel row generation method to process overlong target sequences. In addition, we introduce several negative sampling strategies to improve performance with balanced signals. Experimental results on four datasets show that our proposed method can improve the performance of generative DocRE models. We have released our code at https://github.com/ayyyq/DORE.
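To make the idea of a symbolic, ordered target sequence more concrete, the following is a minimal sketch of how a relation matrix might be linearized row by row. It is not taken from the released code: the token formats (`<e0>`, `<P17>`), the function names `linearize_row` and `linearize_matrix`, and the ordering rule (tail index, then relation label) are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical input format: relation labels keyed by
# (head_entity_index, tail_entity_index). Not the paper's actual data schema.
Facts = Dict[Tuple[int, int], List[str]]


def linearize_row(head: int, num_entities: int, facts: Facts) -> List[str]:
    """Emit one row of the relation matrix as symbolic tokens in a fixed order.

    Tails are visited in increasing index order and relation labels are
    sorted, so the same set of facts always yields the same sequence.
    """
    tokens = [f"<e{head}>"]  # assumed marker for the row's head entity
    for tail in range(num_entities):
        for rel in sorted(facts.get((head, tail), [])):
            tokens += [f"<e{tail}>", f"<{rel}>"]
    return tokens


def linearize_matrix(num_entities: int, facts: Facts) -> List[List[str]]:
    """Build one target sequence per row, so each row could be decoded separately."""
    return [linearize_row(h, num_entities, facts) for h in range(num_entities)]


# Toy example with three entities and made-up relation labels.
example_facts: Facts = {(0, 2): ["P17"], (2, 0): ["P36", "P150"]}
rows = linearize_matrix(3, example_facts)
```

Because the order is fixed, each set of facts maps to exactly one target sequence, which is one way to read "deterministic" in the abstract. Producing a separate sequence per row keeps each target short, in the spirit of the parallel row generation the abstract mentions.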
