Improving Text Generation Evaluation with Batch Centering and Tempered Word Mover Distance

Paper Title

Improving Text Generation Evaluation with Batch Centering and Tempered Word Mover Distance

Authors

Chen, Xi; Ding, Nan; Levinboim, Tomer; Soricut, Radu

Abstract

Recent advances in automatic evaluation metrics for text have shown that deep contextualized word representations, such as those generated by BERT encoders, are helpful for designing metrics that correlate well with human judgements. At the same time, it has been argued that contextualized word representations exhibit sub-optimal statistical properties for encoding the true similarity between words or sentences. In this paper, we present two techniques for improving encoding representations for similarity metrics: a batch-mean centering strategy that improves statistical properties; and a computationally efficient tempered Word Mover Distance, for better fusion of the information in the contextualized word representations. We conduct numerical experiments that demonstrate the robustness of our techniques, reporting results over various BERT-backbone learned metrics and achieving state of the art correlation with human ratings on several benchmarks.
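The batch-mean centering idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `batch_center` and `cosine` are hypothetical, the three-dimensional toy vectors merely stand in for contextualized word embeddings, and the abstract does not say which batch or encoder layer the mean is taken over. The sketch only shows the general effect of subtracting a shared mean before measuring cosine similarity.

```python
import math


def batch_center(vectors):
    """Subtract the per-dimension batch mean from every vector (hypothetical helper)."""
    n = len(vectors)
    dim = len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dim)]
    return [[v[d] - mean[d] for d in range(dim)] for v in vectors]


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy "embeddings" that share a large common offset (an assumption made for
# illustration; real embeddings would come from an encoder such as BERT).
batch = [
    [10.0, 10.0, 1.0],
    [10.0, 10.0, -1.0],
    [10.0, 11.0, 0.0],
    [10.0, 9.0, 0.0],
]

centered = batch_center(batch)

# Before centering, the shared offset dominates and the vectors look near-identical.
raw_sim = cosine(batch[0], batch[1])
# After centering, only the differences between the vectors remain.
centered_sim = cosine(centered[0], centered[1])

print(f"raw cosine:      {raw_sim:.3f}")
print(f"centered cosine: {centered_sim:.3f}")
```

In this toy batch the first two vectors differ only in their last coordinate, yet their raw cosine similarity is close to 1 because the common offset swamps the difference. Once the batch mean is removed, the same pair points in opposite directions, which is the kind of statistical improvement the abstract attributes to centering.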
