Paper Title
GLUECoS: An Evaluation Benchmark for Code-Switched NLP
Paper Authors
Paper Abstract
Code-switching is the use of more than one language in the same conversation or utterance. Recently, multilingual contextual embedding models, trained on multiple monolingual corpora, have shown promising results on cross-lingual and multilingual tasks. We present GLUECoS, an evaluation benchmark for code-switched languages that spans several NLP tasks in English-Hindi and English-Spanish. Specifically, our evaluation benchmark includes Language Identification from text, POS tagging, Named Entity Recognition, Sentiment Analysis, Question Answering, and a new task for code-switching, Natural Language Inference. We present results on all these tasks using cross-lingual word embedding models and multilingual models. In addition, we fine-tune multilingual models on artificially generated code-switched data. Although multilingual models perform significantly better than cross-lingual models, our results show that in most tasks, across both language pairs, multilingual models fine-tuned on code-switched data perform best, showing that multilingual models can be further optimized for code-switching tasks.
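To make the Language Identification task concrete, here is a toy illustration of token-level LID on an English-Spanish code-switched sentence. This is a minimal lexicon-lookup sketch, not the cross-lingual or multilingual models evaluated in the paper; the tiny word lists and example sentence are illustrative assumptions.

```python
# Toy token-level Language Identification (LID) on code-switched text,
# one of the GLUECoS tasks. A real system would use a trained model;
# this lexicon lookup only illustrates the input/output format.

EN_WORDS = {"i", "love", "this", "movie", "but", "the", "ending", "was"}
ES_WORDS = {"pero", "el", "final", "fue", "muy", "triste"}

def label_tokens(sentence):
    """Label each whitespace token as 'en', 'es', or 'other'."""
    labels = []
    for tok in sentence.lower().split():
        if tok in EN_WORDS:
            labels.append("en")
        elif tok in ES_WORDS:
            labels.append("es")
        else:
            labels.append("other")
    return labels

# A code-switched utterance: the speaker switches language mid-sentence,
# which is exactly what models on this benchmark must handle.
print(label_tokens("I love this movie pero el final fue muy triste"))
# → ['en', 'en', 'en', 'en', 'es', 'es', 'es', 'es', 'es', 'es']
```

The same per-token labeling scheme underlies the benchmark's other sequence tasks (POS tagging, NER), where each token receives a tag instead of a language label.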