Paper Title


SNaC: Coherence Error Detection for Narrative Summarization

Paper Authors

Tanya Goyal, Junyi Jessy Li, Greg Durrett

Paper Abstract


Progress in summarizing long texts is inhibited by the lack of appropriate evaluation frameworks. When a long summary must be produced to appropriately cover the facets of that text, that summary needs to present a coherent narrative to be understandable by a reader, but current automatic and human evaluation methods fail to identify gaps in coherence. In this work, we introduce SNaC, a narrative coherence evaluation framework rooted in fine-grained annotations for long summaries. We develop a taxonomy of coherence errors in generated narrative summaries and collect span-level annotations for 6.6k sentences across 150 book and movie screenplay summaries. Our work provides the first characterization of coherence errors generated by state-of-the-art summarization models and a protocol for eliciting coherence judgments from crowd annotators. Furthermore, we show that the collected annotations allow us to train a strong classifier for automatically localizing coherence errors in generated summaries as well as benchmarking past work in coherence modeling. Finally, our SNaC framework can support future work in long document summarization and coherence evaluation, including improved summarization modeling and post-hoc summary correction.
