Paper title
Fake Reviews Detection through Analysis of Linguistic Features
Authors
Abstract
Online reviews play an integral part in the success or failure of businesses. Before purchasing services or goods, customers first read the online reviews submitted by previous customers. However, it is possible to superficially boost or hinder a business by posting counterfeit and fake reviews. This paper explores a natural language processing approach to identifying fake reviews. We present a detailed analysis of linguistic features for distinguishing fake from trustworthy online reviews. We study 15 linguistic features and measure their significance and importance to the classification schemes employed in this study. Our results indicate that fake reviews tend to include more redundant terms and pauses, and generally contain longer sentences. Applying several machine learning classification algorithms showed that these linguistic features can discriminate fake from real reviews with high accuracy.
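To make the kind of feature mentioned in the abstract concrete, the following is a minimal sketch of how three such signals (sentence length, redundant terms, pauses) might be computed from raw review text. The abstract does not list the 15 features or define them, so the function name `linguistic_features`, the three feature names, and their definitions are all assumptions made for illustration, not the authors' method.

```python
import re


def linguistic_features(review: str) -> dict:
    """Compute three illustrative features (hypothetical definitions)."""
    # Rough sentence split on ., !, ? (a heuristic, not a real sentence tokenizer).
    sentences = [s for s in re.split(r"[.!?]+", review) if s.strip()]
    # Lowercased word tokens made of letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", review.lower())
    n_words = len(words)
    n_sentences = max(len(sentences), 1)  # avoid division by zero on empty input

    # Assumption: "redundancy" is the share of tokens that repeat an earlier token.
    redundancy = 1 - len(set(words)) / n_words if n_words else 0.0
    # Assumption: "pauses" are approximated by commas, semicolons, and ellipses.
    pauses = len(re.findall(r",|;|\.\.\.", review))

    return {
        "avg_sentence_length": n_words / n_sentences,
        "redundancy_ratio": redundancy,
        "pause_count": pauses,
    }


features = linguistic_features("Great hotel, great staff, great food. We loved it!")
print(features)
```

A feature dictionary like this, computed per review, could then be turned into a numeric vector and passed to any standard classifier. Which classifiers and which feature definitions the paper actually uses is not stated in the abstract.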