Paper Title
ArabGlossBERT: Fine-Tuning BERT on Context-Gloss Pairs for WSD
Paper Authors
Paper Abstract
Using pre-trained transformer models such as BERT has proven to be effective in many NLP tasks. This paper presents our work to fine-tune BERT models for Arabic Word Sense Disambiguation (WSD). We treated the WSD task as a sentence-pair binary classification task. First, we constructed a dataset of labeled Arabic context-gloss pairs (~167k pairs) extracted from the Arabic Ontology and the large lexicographic database available at Birzeit University. Each pair was labeled as True or False, and the target word in each context was identified and annotated. Second, we used this dataset to fine-tune three pre-trained Arabic BERT models. Third, we experimented with different supervised signals to emphasize target words in context. Although we used a large set of senses in the experiments, our results are promising (84% accuracy).
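The abstract describes casting WSD as sentence-pair binary classification over context-gloss pairs, with a supervised signal marking the target word in the context. A minimal sketch of that pair-construction step is shown below; the function name, the quotation-mark signal, and the example data are illustrative assumptions, not details from the paper.

```python
# Hypothetical sketch of building labeled context-gloss pairs for WSD,
# as described in the abstract. Names and the marker choice are assumptions.
def build_context_gloss_pairs(context, target, glosses, correct_index,
                              marker='"'):
    """Pair a context with each candidate gloss of the target word.

    The target word is emphasized with a supervised signal (here, quotation
    marks around it). Returns (marked_context, gloss, label) triples, where
    label is 1 for the correct gloss and 0 otherwise.
    """
    marked = " ".join(
        f"{marker}{tok}{marker}" if tok == target else tok
        for tok in context.split()
    )
    return [
        (marked, gloss, 1 if i == correct_index else 0)
        for i, gloss in enumerate(glosses)
    ]

# Toy English example for illustration only (the paper's data is Arabic).
pairs = build_context_gloss_pairs(
    "the bank raised interest rates",
    "bank",
    ["a financial institution", "the side of a river"],
    correct_index=0,
)
```

Each resulting (marked context, gloss) pair would then be fed to a BERT sentence-pair classifier (e.g. via `BertForSequenceClassification` with two labels) during fine-tuning.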