Paper title
AIFB-WebScience at SemEval-2022 Task 12: Relation Extraction First -- Using Relation Extraction to Identify Entities
Authors
Abstract
In this paper, we present an end-to-end joint entity and relation extraction approach based on transformer-based language models. We apply the model to the task of linking mathematical symbols to their descriptions in LaTeX documents. In contrast to existing approaches, which perform entity and relation extraction in sequence, our system incorporates information from relation extraction into entity extraction. This means that the system can be trained even on data sets where only a subset of all valid entity spans is annotated. We provide an extensive evaluation of the proposed system and its strengths and weaknesses. Our approach, which can be scaled dynamically in computational complexity at inference time, produces predictions with high precision and reaches 3rd place in the leaderboard of SemEval-2022 Task 12. For inputs in the domain of physics and math, it achieves high relation extraction macro F1 scores of 95.43% and 79.17%, respectively. The code used for training and evaluating our models is available at: https://github.com/nicpopovic/RE1st
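The "relation extraction first" idea in the abstract can be illustrated with a minimal sketch. It assumes a model has already produced scored relations between token spans, and it recovers the entity set as the spans that take part in a confident relation. The names `Span`, `Relation`, `entities_from_relations`, the `threshold` parameter, and the label `"describes"` are all hypothetical and are not taken from the RE1st code base; the authors' actual procedure may differ.

```python
from typing import Iterable, List, NamedTuple


class Span(NamedTuple):
    start: int  # inclusive token index
    end: int    # exclusive token index


class Relation(NamedTuple):
    head: Span    # e.g. a mathematical symbol
    tail: Span    # e.g. the text describing that symbol
    label: str    # hypothetical relation label
    score: float  # model confidence in [0, 1]


def entities_from_relations(
    relations: Iterable[Relation], threshold: float = 0.5
) -> List[Span]:
    """Collect every span that takes part in a sufficiently confident relation.

    Illustrative assumption: an entity is any span that appears as the head
    or tail of a relation whose score reaches the threshold.
    """
    entities = set()
    for rel in relations:
        if rel.score >= threshold:
            entities.add(rel.head)
            entities.add(rel.tail)
    return sorted(entities)


# Made-up predictions for a single sentence.
predicted = [
    Relation(Span(0, 1), Span(3, 6), "describes", 0.92),
    Relation(Span(8, 9), Span(3, 6), "describes", 0.31),
]
entities = entities_from_relations(predicted)
```

Under this assumption, a span that never appears in a confident relation is simply not emitted as an entity, which is one way to see why training would not require every valid entity span to be annotated.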