Paper Title
Relabel the Noise: Joint Extraction of Entities and Relations via Cooperative Multiagents
Paper Authors
Paper Abstract
Distant-supervision-based methods for entity and relation extraction have grown increasingly popular because they require little human annotation effort. In this paper, we consider the problem of shifted label distribution, which is caused by the inconsistency between the noisily labeled training set, constrained by an external knowledge graph, and the human-annotated test set, and which is exacerbated by the pipelined entity-then-relation extraction scheme, where noise propagates from one stage to the next. We propose a joint extraction approach that addresses this problem by re-labeling noisy instances with a group of cooperative multiagents. To handle noisy instances in a fine-grained manner, each agent in the cooperative group evaluates an instance by calculating a continuous confidence score from its own perspective. To leverage the correlations between the two extraction tasks, a confidence consensus module is designed to gather the wisdom of all agents and re-distribute the noisy training set with confidence-scored labels. Further, the confidences are used to adjust the training losses of the extractors. Experimental results on two real-world datasets verify the benefits of re-labeling noisy instances and show that the proposed model significantly outperforms state-of-the-art entity and relation extraction methods.
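The idea of using per-agent confidence scores to adjust a training loss can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify how the consensus module combines scores, so a plain average stands in for it, and the function names `consensus` and `weighted_nll`, as well as the example scores, are hypothetical rather than taken from the paper.

```python
import math

def consensus(agent_scores):
    """Combine per-agent confidence scores (each in [0, 1]) into one value.

    Assumption: a simple mean stands in for the paper's consensus module.
    """
    return sum(agent_scores) / len(agent_scores)

def weighted_nll(probs, labels, confidences):
    """Confidence-weighted negative log-likelihood averaged over a batch.

    probs[i] is the predicted distribution for instance i, labels[i] is its
    (possibly noisy) label index, and confidences[i] scales its loss term.
    """
    total = 0.0
    for p, y, c in zip(probs, labels, confidences):
        total += -c * math.log(p[y])
    return total / len(labels)

# Hypothetical scores from two agents (e.g. entity and relation) per instance.
scores = [[0.9, 0.7], [0.2, 0.4]]
conf = [consensus(s) for s in scores]

# Hypothetical extractor outputs and noisy labels for the same two instances.
probs = [[0.8, 0.2], [0.5, 0.5]]
labels = [0, 1]
loss = weighted_nll(probs, labels, conf)
```

In this sketch an instance that the agents distrust contributes less to the loss, so a likely mislabeled example has a smaller effect on the extractor's updates.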