Paper title
LT@Helsinki at SemEval-2020 Task 12: Multilingual or language-specific BERT?
Paper authors
Paper abstract
This paper presents the models submitted by the LT@Helsinki team for the SemEval 2020 Shared Task 12. Our team participated in sub-tasks A and C, titled offensive language identification and offense target identification, respectively. In both cases we used the so-called Bidirectional Encoder Representations from Transformers (BERT), a model pre-trained by Google and fine-tuned by us on the OLID and SOLID datasets. The results show that offensive tweet classification is one of several language-based tasks where BERT can achieve state-of-the-art results.
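The abstract describes fine-tuning a pre-trained BERT model for offensive tweet classification (sub-task A is a binary NOT/OFF decision). Below is a minimal, hedged sketch of that kind of fine-tuning step using the Hugging Face `transformers` library; it is not the authors' actual code. To keep the sketch self-contained and offline, it builds a tiny randomly initialized BERT from a config rather than downloading Google's pre-trained weights (in practice one would call `BertForSequenceClassification.from_pretrained(...)` with a real checkpoint), and all hyperparameters and the toy batch are illustrative assumptions.

```python
import torch
from transformers import BertConfig, BertForSequenceClassification

# Tiny illustrative config; a real run would instead load pre-trained
# weights, e.g. BertForSequenceClassification.from_pretrained(<checkpoint>,
# num_labels=2), then fine-tune on OLID/SOLID.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    max_position_embeddings=64,
    num_labels=2,  # sub-task A: 0 = not offensive, 1 = offensive
)
model = BertForSequenceClassification(config)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# Toy batch standing in for tokenized tweets: 4 sequences of 16 token ids.
input_ids = torch.randint(0, config.vocab_size, (4, 16))
labels = torch.tensor([0, 1, 0, 1])

# One fine-tuning step: forward pass with labels yields a cross-entropy
# loss over the two classes; backprop and update all BERT parameters.
model.train()
outputs = model(input_ids=input_ids, labels=labels)
outputs.loss.backward()
optimizer.step()

print(outputs.logits.shape)  # one (NOT, OFF) score pair per tweet
```

Sub-task C (offense target identification) would follow the same pattern with `num_labels=3` for the individual/group/other target classes.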