Paper Title
USTC-NELSLIP at SemEval-2022 Task 11: Gazetteer-Adapted Integration Network for Multilingual Complex Named Entity Recognition
Paper Authors
Paper Abstract
This paper describes the system developed by the USTC-NELSLIP team for SemEval-2022 Task 11, Multilingual Complex Named Entity Recognition (MultiCoNER). We propose a gazetteer-adapted integration network (GAIN) to improve the performance of language models for recognizing complex named entities. The method first adapts the representations of the gazetteer network to those of the language model by minimizing the KL divergence between them. After adaptation, the two networks are integrated for backend supervised named entity recognition (NER) training. The proposed method is applied to several state-of-the-art Transformer-based NER models with a gazetteer built from Wikidata, and shows strong generalization ability across them. The final predictions are derived from an ensemble of these trained models. Experimental results and detailed analysis verify the effectiveness of the proposed method. The official results show that our system ranked 1st on three tracks (Chinese, Code-mixed, and Bangla) and 2nd on the other ten tracks in this task.
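The following is a minimal, hypothetical PyTorch sketch of the adaptation-then-integration idea summarized in the abstract; it is not the authors' released implementation. The gazetteer feature encoder, the softmax treatment of token representations inside the KL term, and the concatenation-based integration are assumptions made purely for illustration.

# Minimal sketch (assumptions, not the authors' code) of the GAIN idea:
# a gazetteer network is first adapted toward a frozen language model by
# minimizing the KL divergence between their token representations, then
# the two networks are integrated (here: concatenated) for NER training.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GazetteerNetwork(nn.Module):
    """Hypothetical gazetteer encoder: embeds per-token gazetteer match features."""
    def __init__(self, num_gazetteer_labels: int, hidden_dim: int):
        super().__init__()
        self.embed = nn.Embedding(num_gazetteer_labels, hidden_dim)
        self.proj = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, gazetteer_ids: torch.Tensor) -> torch.Tensor:
        # gazetteer_ids: (batch, seq_len) -> (batch, seq_len, hidden_dim)
        return self.proj(torch.tanh(self.embed(gazetteer_ids)))

def adaptation_loss(gaz_repr: torch.Tensor, lm_repr: torch.Tensor) -> torch.Tensor:
    # One possible reading of "minimizing the KL divergence between them":
    # treat each token's hidden vector as a distribution via softmax and
    # pull the gazetteer representation toward the (fixed) LM representation.
    return F.kl_div(
        F.log_softmax(gaz_repr, dim=-1),
        F.softmax(lm_repr.detach(), dim=-1),  # language model is kept frozen here
        reduction="batchmean",
    )

class IntegratedNERHead(nn.Module):
    """After adaptation, integrate both representations for token-level tagging."""
    def __init__(self, hidden_dim: int, num_tags: int):
        super().__init__()
        self.classifier = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, lm_repr: torch.Tensor, gaz_repr: torch.Tensor) -> torch.Tensor:
        # Concatenation is an illustrative choice of integration.
        return self.classifier(torch.cat([lm_repr, gaz_repr], dim=-1))

Under these assumptions, training would proceed in two stages: first optimize adaptation_loss with the language model frozen, then train the integrated tagger with a standard supervised NER objective (e.g., token-level cross-entropy), and finally ensemble the predictions of several such models.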