Paper Title

Semi-Supervised Disentangled Framework for Transferable Named Entity Recognition

Authors

Zhifeng Hao, Di Lv, Zijian Li, Ruichu Cai, Wen Wen, Boyan Xu

Abstract

Named entity recognition (NER), the task of identifying proper nouns in unstructured text, is one of the most important and fundamental tasks in natural language processing. However, despite the widespread use of NER models, they still require large-scale labeled datasets, which impose a heavy manual-annotation burden. Domain adaptation is one of the most promising solutions to this problem: rich labeled data from a related source domain are used to strengthen the generalizability of a model on the target domain. However, mainstream cross-domain NER models are still affected by two challenges: (1) extracting domain-invariant information, such as syntactic information, for cross-domain transfer; and (2) integrating domain-specific information, such as semantic information, into the model to improve NER performance. In this study, we present a semi-supervised framework for transferable NER that disentangles domain-invariant latent variables from domain-specific latent variables. In the proposed framework, domain-specific information is integrated into the domain-specific latent variables by using a domain predictor. The domain-specific and domain-invariant latent variables are disentangled using three mutual information regularization terms: maximizing the mutual information between the domain-specific latent variables and the original embedding, maximizing the mutual information between the domain-invariant latent variables and the original embedding, and minimizing the mutual information between the domain-specific and domain-invariant latent variables. Extensive experiments demonstrate that our model obtains state-of-the-art performance on cross-domain and cross-lingual NER benchmark datasets.
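The three regularization terms can be sketched as a single loss. The abstract does not specify the mutual information estimator, so the sketch below assumes an InfoNCE-style lower bound with a toy bilinear critic; the critic, the random features, and all variable names are hypothetical stand-ins, not the paper's implementation.

```python
import numpy as np

def info_nce_lower_bound(scores):
    """InfoNCE lower bound on mutual information.

    scores[i, j] is a critic score between sample i of the first variable
    and sample j of the second; the diagonal holds the true pairs.
    Returns log(n) + mean diagonal log-softmax, a lower bound on I(X; Z).
    """
    n = scores.shape[0]
    m = scores.max(axis=1, keepdims=True)  # shift for numerical stability
    log_norm = m.squeeze(1) + np.log(np.exp(scores - m).sum(axis=1))
    return np.log(n) + (np.diag(scores) - log_norm).mean()

def disentanglement_regularizer(emb, z_spec, z_inv, critic):
    """Combine the abstract's three MI terms into one quantity to minimize:
    maximize I(z_spec; emb) and I(z_inv; emb), minimize I(z_spec; z_inv)."""
    mi_spec = info_nce_lower_bound(critic(z_spec, emb))
    mi_inv = info_nce_lower_bound(critic(z_inv, emb))
    mi_cross = info_nce_lower_bound(critic(z_spec, z_inv))
    return -mi_spec - mi_inv + mi_cross

# Toy bilinear critic and random features (all hypothetical).
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))
critic = lambda a, b: a @ W @ b.T

emb = rng.normal(size=(8, 4))                   # "original embedding"
z_spec = emb + 0.1 * rng.normal(size=(8, 4))    # domain-specific latents
z_inv = rng.normal(size=(8, 4))                 # domain-invariant latents
reg = disentanglement_regularizer(emb, z_spec, z_inv, critic)
print(reg)
```

In a full model this regularizer would be added to the NER tagging loss and the domain-predictor loss; here it only illustrates how the two maximized terms enter with a negative sign and the cross term with a positive sign.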
