Paper Title
TaxiNLI: Taking a Ride up the NLU Hill
Paper Authors
Paper Abstract
Pre-trained Transformer-based neural architectures have consistently achieved state-of-the-art performance on the Natural Language Inference (NLI) task. Since NLI examples encompass a variety of linguistic, logical, and reasoning phenomena, it remains unclear which specific concepts the trained systems learn and where they can achieve strong generalization. To investigate this question, we propose a taxonomic hierarchy of categories relevant to the NLI task. We introduce TAXINLI, a new dataset of 10k examples from the MNLI dataset (Williams et al., 2018) annotated with these taxonomic labels. Through various experiments on TAXINLI, we observe that while SOTA neural models have achieved near-perfect accuracies for certain taxonomic categories (a large jump over previous models), some categories still remain difficult. Our work adds to the growing body of literature that shows the gaps in current NLI systems and datasets through a systematic presentation and analysis of reasoning categories.