Paper Title
IDS at SemEval-2020 Task 10: Does Pre-trained Language Model Know What to Emphasize?
Paper Authors
Paper Abstract
We propose a novel method that enables us to determine words that deserve to be emphasized from written text in visual media, relying only on the information from the self-attention distributions of pre-trained language models (PLMs). With extensive experiments and analyses, we show that 1) our zero-shot approach is superior to a reasonable baseline that adopts TF-IDF and that 2) there exist several attention heads in PLMs specialized for emphasis selection, confirming that PLMs are capable of recognizing important words in sentences.
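For illustration, here is a minimal sketch of one way such attention-based scoring could be implemented (the model name bert-base-uncased, the averaging over all layers and heads, and the example sentence are assumptions for this sketch, not the paper's exact per-head procedure):

```python
# A minimal sketch, assuming HuggingFace Transformers and bert-base-uncased:
# rank tokens in a sentence by the self-attention mass they receive,
# averaged over all layers and heads. This is an illustrative aggregation,
# not the paper's per-head analysis.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

sentence = "pre-trained language models know what to emphasize"
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions is a tuple with one tensor per layer, each of shape
# (batch, num_heads, seq_len, seq_len).
attn = torch.stack(outputs.attentions)   # (num_layers, batch, heads, seq, seq)
avg = attn.mean(dim=0)[0].mean(dim=0)    # average over layers and heads -> (seq, seq)
received = avg.sum(dim=0)                # attention each token receives from all queries

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
ranking = sorted(zip(tokens, received.tolist()), key=lambda x: -x[1])
for tok, score in ranking:
    if tok not in ("[CLS]", "[SEP]"):
        print(f"{tok:15s}{score:.3f}")
```

Tokens with the highest received attention would then be proposed as emphasis candidates; a TF-IDF ranking over the same tokens serves as a natural zero-shot baseline for comparison.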