Paper Title
Arabic Fake News Detection Based on Deep Contextualized Embedding Models
Paper Authors
Paper Abstract
Social media is becoming a source of news for many people due to its ease and freedom of use. As a result, fake news has been spreading quickly and easily regardless of its credibility, especially in the last decade. Fake news publishers take advantage of critical situations such as the COVID-19 pandemic and the American presidential elections to affect societies negatively. Fake news can seriously impact society in many fields, including politics, finance, and sports. Many studies have been conducted to help detect fake news in English, but research on fake news detection in Arabic is scarce. Our contribution is twofold: first, we construct a large and diverse Arabic fake news dataset; second, we develop and evaluate transformer-based classifiers to identify fake news using eight state-of-the-art Arabic contextualized embedding models. The majority of these models had not previously been used for Arabic fake news detection. We conduct a thorough analysis of these models and compare them with similar fake news detection systems. Experimental results confirm that these state-of-the-art models are robust, with accuracy exceeding 98%.