Discourse Context Predictability Effects in Hindi Word Order

Title

Discourse Context Predictability Effects in Hindi Word Order

Authors

Sidharth Ranjan, Marten van Schijndel, Sumeet Agarwal, Rajakrishnan Rajkumar

Abstract

We test the hypothesis that discourse predictability influences Hindi syntactic choice. While prior work has shown that a number of factors (e.g., information status, dependency length, and syntactic surprisal) influence Hindi word order preferences, the role of discourse predictability is underexplored in the literature. Inspired by prior work on syntactic priming, we investigate how the words and syntactic structures in a sentence influence the word order of the following sentences. Specifically, we extract sentences from the Hindi-Urdu Treebank corpus (HUTB), permute the preverbal constituents of those sentences, and build a classifier to predict which sentences actually occurred in the corpus against artificially generated distractors. The classifier uses a number of discourse-based features and cognitive features to make its predictions, including dependency length, surprisal, and information status. We find that information status and LSTM-based discourse predictability influence word order choices, especially for non-canonical object-fronted orders. We conclude by situating our results within the broader syntactic priming literature.
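
The distractor construction mentioned in the abstract (permuting the preverbal constituents while the verb stays in place) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `generate_variants`, the list-of-strings input format, and the placeholder constituent labels are all hypothetical, and the abstract does not specify how constituents are identified in the HUTB trees.

```python
from itertools import permutations


def generate_variants(preverbal, verb):
    """Return (reference, distractors) for one sentence.

    preverbal: constituents in their corpus order (hypothetical input format).
    verb: the verbal element, kept in final position in every variant.
    """
    reference_order = tuple(preverbal)
    # Every reordering of the preverbal constituents other than the
    # corpus order counts as an artificially generated distractor.
    distractors = [
        list(order) + [verb]
        for order in permutations(preverbal)
        if order != reference_order
    ]
    return list(reference_order) + [verb], distractors


# Placeholder labels stand in for real Hindi constituents.
reference, distractors = generate_variants(["SUBJ", "OBJ", "ADV"], "VERB")
```

A classifier of the kind the abstract describes would then be trained to pick `reference` out from `distractors`, using features such as dependency length, surprisal, and information status computed for each variant.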
