Paper Title

The Devil is in the Details: Evaluating Limitations of Transformer-based Methods for Granular Tasks

Paper Authors

Brihi Joshi, Neil Shah, Francesco Barbieri, Leonardo Neves

Paper Abstract

Contextual embeddings derived from transformer-based neural language models have shown state-of-the-art performance for various tasks such as question answering, sentiment analysis, and textual similarity in recent years. Extensive work shows how accurately such models can represent abstract, semantic information present in text. In this expository work, we explore a tangent direction and analyze such models' performance on tasks that require a more granular level of representation. We focus on the problem of textual similarity from two perspectives: matching documents on a granular level (requiring embeddings to capture fine-grained attributes in the text), and an abstract level (requiring embeddings to capture overall textual semantics). We empirically demonstrate, across two datasets from different domains, that despite high performance in abstract document matching as expected, contextual embeddings are consistently (and at times, vastly) outperformed by simple baselines like TF-IDF for more granular tasks. We then propose a simple but effective method to incorporate TF-IDF into models that use contextual embeddings, achieving relative improvements of up to 36% on granular tasks.
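
The abstract does not spell out how TF-IDF is incorporated into the embedding-based models, so the following is only a minimal sketch of one plausible reading: interpolating TF-IDF cosine similarity (which rewards exact, fine-grained lexical overlap) with contextual-embedding cosine similarity (which captures abstract semantics). The encoder name `all-MiniLM-L6-v2`, the weight `alpha`, and the toy documents are illustrative assumptions, not details from the paper.

```python
# Sketch: rank documents against a query by interpolating TF-IDF similarity
# (fine-grained lexical matching) with contextual-embedding similarity
# (abstract semantic matching). Not the paper's exact method.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sentence_transformers import SentenceTransformer

docs = [
    "transformer embeddings capture overall document semantics",
    "tf-idf weighting highlights rare, fine-grained tokens",
]
query = "fine-grained token overlap matters for granular matching"

# TF-IDF similarity: sensitive to exact, granular attributes in the text.
vectorizer = TfidfVectorizer()
tfidf_docs = vectorizer.fit_transform(docs)
tfidf_query = vectorizer.transform([query])
tfidf_sim = cosine_similarity(tfidf_query, tfidf_docs)[0]

# Contextual-embedding similarity: captures overall textual semantics.
model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed choice of encoder
emb_docs = model.encode(docs)
emb_query = model.encode([query])
emb_sim = cosine_similarity(emb_query, emb_docs)[0]

# Simple score interpolation; alpha would be tuned on a validation set.
alpha = 0.5
combined = alpha * tfidf_sim + (1 - alpha) * emb_sim
print(combined.argsort()[::-1])  # document indices ranked by combined score
```

In this reading, TF-IDF supplies the signal that contextual embeddings lose on granular tasks, while the embeddings keep abstract matching strong; the two are combined at the score level rather than by retraining the encoder.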
