Paper Title
Scientific and Technological Text Knowledge Extraction Method Based on Word Mixing and GRU
Authors
Abstract
Knowledge extraction aims to extract relation triples (head entity, relation, tail entity) from unstructured text. Existing methods fall into two groups: pipeline methods and joint extraction methods. Pipeline methods treat named entity recognition and entity relation extraction as separate tasks, each handled by its own module. This approach is more flexible, but training is slow. Joint extraction methods use an end-to-end neural network model that performs entity recognition and relation extraction at the same time, which preserves the association between entities and relations and converts the joint extraction of entities and relations into a sequence labeling problem. In this paper, we propose a knowledge extraction method for scientific and technological resources based on word mixing and GRU. It combines a word-mixing vector mapping method with a self-attention mechanism to improve relation extraction on Chinese scientific and technological text.
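To make the idea of treating joint extraction as a sequence annotation (sequence labeling) problem more concrete, the sketch below decodes per-token tags into triples. The abstract does not specify a tag scheme, so everything here is an assumption made for illustration: the tag format `B|I-Relation-Role` (role `1` for the head entity, `2` for the tail entity), the function name `decode_triples`, and the rule that heads and tails of the same relation are paired in order of appearance. It is not the paper's model, which would produce such tags with its GRU and self-attention layers.

```python
from collections import defaultdict


def decode_triples(tokens, tags):
    """Turn per-token tags into (head, relation, tail) triples.

    Assumed (hypothetical) tag format: "O" for tokens outside any entity,
    otherwise "<B|I>-<Relation>-<Role>", where Role "1" marks the head
    entity and "2" marks the tail entity. Relation names must not contain "-".
    """
    # relation -> {"1": [head strings], "2": [tail strings]}
    spans = defaultdict(lambda: {"1": [], "2": []})
    current = None  # (relation, role, [tokens]) for the entity span being built

    def close(span):
        # Store a finished entity span under its relation and role.
        if span is not None:
            relation, role, words = span
            # Joined with spaces here; for Chinese text, join with "" instead.
            spans[relation][role].append(" ".join(words))

    for token, tag in zip(tokens, tags):
        if tag == "O":
            close(current)
            current = None
            continue
        position, relation, role = tag.split("-")
        starts_new = (
            position == "B"
            or current is None
            or current[:2] != (relation, role)
        )
        if starts_new:
            close(current)
            current = (relation, role, [token])
        else:
            current[2].append(token)
    close(current)

    # Assumed pairing rule: the i-th head of a relation goes with its i-th tail.
    triples = []
    for relation, roles in spans.items():
        for head, tail in zip(roles["1"], roles["2"]):
            triples.append((head, relation, tail))
    return triples


if __name__ == "__main__":
    tokens = ["Marie", "Curie", "discovered", "polonium"]
    tags = ["B-Discovered-1", "I-Discovered-1", "O", "B-Discovered-2"]
    print(decode_triples(tokens, tags))
    # [('Marie Curie', 'Discovered', 'polonium')]
```

Because each tag carries both the relation type and the entity's role, a single tagging pass yields entities and relations together, which is what distinguishes this formulation from a pipeline that runs entity recognition first and relation classification second.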