Paper Title

Syntactic Surprisal From Neural Models Predicts, But Underestimates, Human Processing Difficulty From Syntactic Ambiguities

Authors

Suhas Arehalli, Brian Dillon, Tal Linzen

Abstract

Humans exhibit garden path effects: when reading sentences that are temporarily structurally ambiguous, they slow down when the structure is disambiguated in favor of the less preferred alternative. Surprisal theory (Hale, 2001; Levy, 2008), a prominent explanation of this finding, proposes that these slowdowns are due to the unpredictability of each of the words that occur in these sentences. Challenging this hypothesis, van Schijndel & Linzen (2021) find that estimates of the cost of word predictability derived from language models severely underestimate the magnitude of human garden path effects. In this work, we consider whether this underestimation is due to the fact that humans weight syntactic factors in their predictions more highly than language models do. We propose a method for estimating syntactic predictability from a language model, allowing us to weigh the cost of lexical and syntactic predictability independently. We find that treating syntactic predictability independently from lexical predictability indeed results in larger estimates of garden path effects. At the same time, even when syntactic predictability is independently weighted, surprisal still greatly underestimates the magnitude of human garden path effects. Our results support the hypothesis that predictability is not the only factor responsible for the processing cost associated with garden path sentences.
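
To make the idea of weighting lexical and syntactic predictability independently more concrete, the sketch below computes surprisal as the negative log probability of a word in context and combines two surprisal terms with separate weights. It is only an illustration under stated assumptions: the linear form, the function names `surprisal` and `predicted_slowdown`, the weights, and the probabilities are all hypothetical and are not taken from the paper, which does not specify its estimation procedure in this abstract.

```python
import math


def surprisal(prob: float) -> float:
    """Surprisal in bits: -log2 p(item | context)."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("prob must be in (0, 1]")
    return -math.log2(prob)


def predicted_slowdown(lexical_prob: float, syntactic_prob: float,
                       w_lex: float, w_syn: float) -> float:
    """Hypothetical linear linking function (an assumption for illustration).

    Each surprisal term gets its own weight, so syntactic unpredictability
    can cost more (or less) per bit than lexical unpredictability.
    """
    return w_lex * surprisal(lexical_prob) + w_syn * surprisal(syntactic_prob)


# Made-up numbers: a disambiguating word with probability 0.25 whose
# syntactic continuation has probability 0.125, weighted 2.0 and 4.0.
cost = predicted_slowdown(0.25, 0.125, w_lex=2.0, w_syn=4.0)
print(cost)
```

With a single shared weight the two terms would contribute equally per bit; giving the syntactic term its own weight is what allows a model of this kind to predict larger garden path effects from the same probabilities.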
