Paper Title
Team ÚFAL at CMCL 2022 Shared Task: Figuring out the correct recipe for predicting Eye-Tracking features using Pretrained Language Models
Paper Authors
Paper Abstract
Eye-tracking data is a very useful source of information for studying cognition, and especially language comprehension, in humans. In this paper, we describe our systems for the CMCL 2022 shared task on predicting eye-tracking information. We describe our experiments with pretrained models like BERT and XLM, and the different ways in which we used those representations to predict four eye-tracking features. Along with analysing the effect of using two different kinds of pretrained multilingual language models and different ways of pooling the token-level representations, we also explore how contextual information affects the performance of the systems. Finally, we also explore whether factors like augmenting the input with linguistic information affect the predictions. Our submissions achieved an average MAE of 5.72 and ranked 5th in the shared task. The average MAE was further reduced to 5.25 in post-task evaluation.