DualNER: A Dual-Teaching Framework for Zero-shot Cross-lingual Named Entity Recognition

Paper Title

DualNER: A Dual-Teaching framework for Zero-shot Cross-lingual Named Entity Recognition

Authors

Jiali Zeng, Yufan Jiang, Yongjing Yin, Xu Wang, Binghuai Lin, Yunbo Cao

Abstract

We present DualNER, a simple and effective framework that makes full use of both an annotated source-language corpus and unlabeled target-language text for zero-shot cross-lingual named entity recognition (NER). In particular, we combine two complementary learning paradigms of NER, i.e., sequence labeling and span prediction, into a unified multi-task framework. After obtaining a sufficient NER model trained on the source data, we further train it on the target data in a dual-teaching manner, in which the pseudo-labels for one task are constructed from the predictions of the other task. Moreover, based on the span prediction, an entity-aware regularization is proposed to enhance the intrinsic cross-lingual alignment between the same entities in different languages. Experiments and analysis demonstrate the effectiveness of DualNER. Code is available at https://github.com/lemon0830/dualNER.
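
The dual-teaching step depends on turning the output of one task into labels for the other. The sketch below shows one way such a conversion could look. It is not taken from the authors' repository: it assumes a BIO tagging scheme and spans given as (start, end, type) with an inclusive end index, and the function names bio_to_spans and spans_to_bio are hypothetical.

```python
from typing import List, Tuple

# Assumed span format: (start index, inclusive end index, entity type).
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Turn sequence-labeling output (BIO tags) into span pseudo-labels."""
    spans: List[Span] = []
    start, etype = None, None
    for i, tag in enumerate(tags):
        # An I- tag continues the open entity only if the type matches.
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.append((start, i - 1, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
        # A stray I- tag with no open entity is ignored.
    if start is not None:
        spans.append((start, len(tags) - 1, etype))
    return spans


def spans_to_bio(spans: List[Span], length: int) -> List[str]:
    """Turn span-prediction output into BIO pseudo-labels."""
    tags = ["O"] * length
    for start, end, etype in sorted(spans):
        # Skip a span that overlaps one already written (flat NER assumed).
        if any(t != "O" for t in tags[start:end + 1]):
            continue
        tags[start] = f"B-{etype}"
        for i in range(start + 1, end + 1):
            tags[i] = f"I-{etype}"
    return tags


# Usage: each task's prediction becomes the other task's pseudo-label.
predicted_tags = ["B-PER", "I-PER", "O", "B-LOC"]
span_pseudo_labels = bio_to_spans(predicted_tags)
tag_pseudo_labels = spans_to_bio(span_pseudo_labels, len(predicted_tags))
```

In an actual training loop these conversions would be applied to model predictions on unlabeled target-language text. How predictions are filtered or weighted before being used as pseudo-labels is not specified in the abstract and is left out here.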
