Paper Title
Searching for Structure in Unfalsifiable Claims
Paper Authors
Paper Abstract
Social media platforms give rise to an abundance of posts and comments on every topic imaginable. Many of these posts express opinions on various aspects of society, but their unfalsifiable nature makes them ill-suited to fact-checking pipelines. In this work, we aim to distill such posts into a small set of narratives that capture the essential claims related to a given topic. Understanding and visualizing these narratives can facilitate more informed debates on social media. As a first step towards systematically identifying the underlying narratives on social media, we introduce PAPYER, a fine-grained dataset of online comments related to hygiene in public restrooms, which contains a multitude of unfalsifiable claims. We present a human-in-the-loop pipeline that uses a combination of machine and human kernels to discover the prevailing narratives and show that this pipeline outperforms recent large transformer models and state-of-the-art unsupervised topic models.