Birds of a Feather Flock Together: Satirical News Detection via Language Model Differentiation

Paper Title

Birds of a Feather Flock Together: Satirical News Detection via Language Model Differentiation

Authors

Yigeng Zhang, Fan Yang, Yifan Zhang, Eduard Dragut, Arjun Mukherjee

Abstract

Satirical news is regularly shared in modern social media because it is entertaining with smartly embedded humor. However, it can be harmful to society because it can sometimes be mistaken for factual news due to its deceptive character. We found that in satirical news, the lexical and pragmatic attributes of the context are the key factors in amusing the readers. In this work, we propose a method that differentiates satirical news from true news. It takes advantage of satirical writing evidence by leveraging the difference between the prediction loss of two language models, one trained on true news and the other on satirical news, when given a new news article. We compute several statistical metrics of language model prediction loss as features, which are then used to conduct downstream classification. The proposed method is computationally effective because the language models capture the language usage differences between satirical news documents and traditional news documents, and are sensitive when applied to documents outside their domains.
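
The pipeline described in the abstract can be illustrated with a minimal sketch, under loud assumptions: toy add-one-smoothed unigram models stand in for the paper's language models (the abstract does not say which models are used), and the names `UnigramLM` and `loss_features`, the feature names, and the particular statistics (mean, standard deviation, max, min) are hypothetical choices made for illustration only. The input text is assumed to be non-empty.

```python
import math
from collections import Counter
from statistics import mean, pstdev


class UnigramLM:
    """Toy add-one-smoothed unigram model (a stand-in, not the paper's model)."""

    def __init__(self, docs):
        self.counts = Counter(tok for doc in docs for tok in doc.lower().split())
        self.total = sum(self.counts.values())
        self.vocab = len(self.counts) + 1  # one extra slot for unseen tokens

    def token_losses(self, text):
        # Per-token negative log-likelihood, used here as the "prediction loss".
        return [
            -math.log((self.counts[tok] + 1) / (self.total + self.vocab))
            for tok in text.lower().split()
        ]


def loss_features(text, true_lm, satire_lm):
    """Summary statistics of each model's loss and of their per-token difference."""
    loss_true = true_lm.token_losses(text)
    loss_satire = satire_lm.token_losses(text)
    diff = [a - b for a, b in zip(loss_true, loss_satire)]
    feats = {}
    for name, xs in (("true", loss_true), ("satire", loss_satire), ("diff", diff)):
        feats[f"{name}_mean"] = mean(xs)
        feats[f"{name}_std"] = pstdev(xs)
        feats[f"{name}_max"] = max(xs)
        feats[f"{name}_min"] = min(xs)
    return feats


# Made-up miniature corpora, purely for illustration.
true_docs = [
    "the senate passed the budget bill on tuesday",
    "the central bank raised interest rates",
]
satire_docs = [
    "area man heroically naps through entire meeting",
    "local cat elected mayor in landslide",
]
true_lm = UnigramLM(true_docs)
satire_lm = UnigramLM(satire_docs)

satire_like = loss_features("local cat naps through budget meeting", true_lm, satire_lm)
news_like = loss_features("the senate raised the budget", true_lm, satire_lm)
```

In this sketch a positive `diff_mean` means the true-news model finds the text more surprising than the satire model does, and a negative value means the opposite. The resulting feature dictionaries would then be passed to any standard classifier for the downstream step, which is left out here.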
