CompRes: A Dataset for Narrative Structure in News

Paper Title

CompRes: A Dataset for Narrative Structure in News

Authors

Effi Levi, Guy Mor, Shaul Shenhav, Tamir Sheafer

Abstract

This paper addresses the task of automatically detecting narrative structures in raw texts. Previous works have utilized the oral narrative theory of Labov and Waletzky to identify various narrative elements in the texts of personal stories. Instead, we direct our focus to news articles, motivated by their growing social impact as well as their role in creating and shaping public opinion. We introduce CompRes, the first dataset for narrative structure in news media. We describe the process by which the dataset was constructed: first, we designed a new narrative annotation scheme, better suited for news media, by adapting elements from the narrative theory of Labov and Waletzky (Complication and Resolution) and adding a new narrative element of our own (Success); then, we used that scheme to annotate a set of 29 English news articles (containing 1,099 sentences) collected from news and partisan websites. We use the annotated dataset to train several supervised models to identify the different narrative elements, achieving an $F_1$ score of up to 0.7. We conclude by suggesting several promising directions for future work.
