Automatic Detection of Entity-Manipulated Text using Factual Knowledge

Paper title

Automatic Detection of Entity-Manipulated Text using Factual Knowledge

Authors

Ganesh Jawahar, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan

Abstract

In this work, we focus on the problem of distinguishing a human-written news article from a news article that is created by manipulating entities in a human-written news article (e.g., replacing entities with factually incorrect entities). Such manipulated articles can mislead the reader by posing as human-written news articles. We propose a neural-network-based detector that detects manipulated news articles by reasoning about the facts mentioned in the article. Our proposed detector exploits factual knowledge via a graph convolutional neural network along with the textual information in the news article. We also create challenging datasets for this task by considering various strategies to generate the new replacement entity (e.g., entity generation from GPT-2). In all the settings, our proposed model either matches or outperforms the state-of-the-art detector in terms of accuracy. Our code and data are available at https://github.com/UBC-NLP/manipulated_entity_detection.
