Paper Title

The Better Your Syntax, the Better Your Semantics? Probing Pretrained Language Models for the English Comparative Correlative

Authors

Leonie Weissweiler, Valentin Hofmann, Abdullatif Köksal, Hinrich Schütze

Abstract

Construction Grammar (CxG) is a paradigm from cognitive linguistics emphasising the connection between syntax and semantics. Rather than rules that operate on lexical items, it posits constructions as the central building blocks of language, i.e., linguistic units of different granularity that combine syntax and semantics. As a first step towards assessing the compatibility of CxG with the syntactic and semantic knowledge demonstrated by state-of-the-art pretrained language models (PLMs), we present an investigation of their capability to classify and understand one of the most commonly studied constructions, the English comparative correlative (CC). We conduct experiments examining the classification accuracy of a syntactic probe on the one hand and the models' behaviour in a semantic application task on the other, with BERT, RoBERTa, and DeBERTa as the example PLMs. Our results show that all three investigated PLMs are able to recognise the structure of the CC but fail to use its meaning. While human-like performance of PLMs on many NLP tasks has been alleged, this indicates that PLMs still suffer from substantial shortcomings in central domains of linguistic knowledge.
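
A probing experiment of the kind described in the abstract could look roughly like the following minimal sketch: a frozen PLM encodes sentences, and a simple linear classifier is trained on those representations to decide whether a sentence instantiates the CC. The example sentences, the choice of bert-base-uncased, and the logistic-regression probe are illustrative assumptions, not the authors' exact setup.

```python
# Minimal sketch of a syntactic probe for the comparative correlative (CC).
# Illustrative only: the sentences, model choice, and probe architecture are
# assumptions, not the paper's exact experimental setup.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Toy data: label 1 = sentence instantiates the CC, label 0 = it does not.
sentences = [
    "The more you practise, the better you get.",
    "The longer the meeting ran, the more tired everyone became.",
    "You get better when you practise more.",
    "Everyone became more tired as the meeting ran longer.",
]
labels = [1, 1, 0, 0]

model_name = "bert-base-uncased"  # RoBERTa or DeBERTa could be swapped in.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.eval()

# Encode each sentence and take the [CLS] vector as a fixed-size representation.
with torch.no_grad():
    enc = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    cls_embeddings = model(**enc).last_hidden_state[:, 0, :].numpy()

# A linear probe: if the CC's structure is recoverable from frozen embeddings,
# even this simple classifier should separate the two classes.
probe = LogisticRegression(max_iter=1000).fit(cls_embeddings, labels)
preds = probe.predict(cls_embeddings)
print("probe accuracy (train set, toy data):", accuracy_score(labels, preds))
```

In practice such a probe would be trained and evaluated on large, held-out sets; the point of using a deliberately simple classifier is that high accuracy indicates the construction's structure is already encoded in the frozen representations rather than learned by the probe itself.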
