Paper Title
Do Transformer Models Show Similar Attention Patterns to Task-Specific Human Gaze?
Paper Authors
Paper Abstract
Learned self-attention functions in state-of-the-art NLP models often correlate with human attention. We investigate whether self-attention in large-scale pre-trained language models is as predictive of human eye fixation patterns during task-reading as classical cognitive models of human attention. We compare attention functions across two task-specific reading datasets for sentiment analysis and relation extraction. We find the predictiveness of large-scale pre-trained self-attention for human attention depends on 'what is in the tail', e.g., the syntactic nature of rare contexts. Further, we observe that task-specific fine-tuning does not increase the correlation with human task-specific reading. Through an input reduction experiment we give complementary insights on the sparsity and fidelity trade-off, showing that lower-entropy attention vectors are more faithful.
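The core measurement behind this abstract can be illustrated with a minimal sketch. The snippet below is an assumption-laden illustration, not the paper's actual pipeline: it extracts token-level attention from a pre-trained BERT via Hugging Face transformers, correlates it with invented per-token fixation durations using Spearman's rho, and computes the entropy of the attention vector (the abstract notes lower-entropy attention tends to be more faithful). The model choice, the layer/head averaging scheme, and the fixation values are all assumptions made for illustration.

```python
# Hedged sketch: correlate a pre-trained Transformer's token-level
# attention with human fixation durations, and measure attention entropy.
# NOT the paper's exact method; model, averaging, and data are assumed.
import numpy as np
from scipy.stats import spearmanr, entropy
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

sentence = "The movie was very good"
inputs = tokenizer(sentence, return_tensors="pt")
outputs = model(**inputs)

# outputs.attentions: one tensor per layer, each (batch, heads, seq, seq).
# A simple token-importance proxy: mean attention each token *receives*,
# averaged over layers, heads, and query positions.
att = np.stack([a.detach().numpy() for a in outputs.attentions])
token_attention = att.mean(axis=(0, 2, 3))[0]  # shape: (seq_len,)

# Hypothetical per-token fixation durations (ms), aligned to the same
# tokenization: [CLS] the movie was very good [SEP] (zeros for specials).
human_fixations = np.array([0.0, 198.0, 243.0, 110.0, 155.0, 260.0, 0.0])

rho, p = spearmanr(token_attention, human_fixations)
print(f"Spearman correlation: {rho:.3f} (p={p:.3f})")

# Entropy of the normalized attention vector: lower entropy means a
# sparser, more peaked attention distribution.
probs = token_attention / token_attention.sum()
print(f"Attention entropy: {entropy(probs):.3f}")
```

In practice one would align wordpiece tokens to the word-level fixation annotations of the eye-tracking corpus before correlating; the sketch sidesteps that step by assuming a one-to-one token alignment.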