Paper Title
BIC: Twitter Bot Detection with Text-Graph Interaction and Semantic Consistency
Authors
Abstract
Twitter bots are automated programs operated by malicious actors to manipulate public opinion and spread misinformation. Research efforts have been made to automatically identify bots based on texts and networks on social media. Most existing methods leverage either texts or networks alone, and only a few works have explored shallow combinations of the two modalities. We hypothesize that interaction and information exchange between texts and graphs could be crucial for holistically evaluating bot activities on social media. In addition, according to a recent survey (Cresci, 2020), Twitter bots are constantly evolving, and advanced bots steal genuine users' tweets to dilute their own malicious content and evade detection. This results in greater inconsistency across the timelines of novel Twitter bots, which warrants more attention. In light of these challenges, we propose BIC, a Twitter Bot detection framework with text-graph Interaction and semantic Consistency. Specifically, in addition to separately modeling the two modalities on social media, BIC employs a text-graph interaction module to enable information exchange across modalities during learning. Furthermore, given the stealing behavior of novel Twitter bots, BIC models semantic consistency in tweets based on attention weights and uses it to augment the decision process. Extensive experiments demonstrate that BIC consistently outperforms state-of-the-art baselines on two widely adopted datasets. Further analyses reveal that text-graph interaction and semantic consistency modeling are essential improvements and help combat bot evolution.
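The abstract only states that semantic consistency is modeled "based on attention weights". To make that idea concrete, the following is a minimal sketch and not the BIC implementation: it assumes each tweet is already represented by an embedding vector, computes scaled dot-product attention weights among the tweets of one timeline, and summarizes the weight matrix with a normalized-entropy score. The function names (`attention_weights`, `consistency_score`) and the entropy summary are illustrative assumptions; the abstract does not say how the weights are turned into a consistency feature.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(embeddings):
    """Scaled dot-product attention weights among tweet embeddings.

    Row i holds how strongly tweet i attends to every tweet in the
    same timeline (each row sums to 1).
    """
    scale = math.sqrt(len(embeddings[0]))
    return [
        softmax([sum(a * b for a, b in zip(query, key)) / scale
                 for key in embeddings])
        for query in embeddings
    ]


def consistency_score(weights):
    """Hypothetical summary of an attention matrix (an assumption here).

    Returns the mean row entropy divided by log(n). Near-uniform
    attention (tweets all alike) gives a value close to 1; attention
    concentrated on a subset of tweets gives a lower value.
    """
    n = len(weights)
    if n < 2:
        return 1.0  # a single tweet is trivially consistent
    entropies = [
        -sum(p * math.log(p) for p in row if p > 0.0)
        for row in weights
    ]
    return sum(entropies) / (n * math.log(n))


# Toy timelines with 2-dimensional embeddings (made-up values).
same_topic = [[1.0, 0.0]] * 4
mixed_topic = [[4.0, 0.0], [4.0, 0.0], [0.0, 4.0], [0.0, 4.0]]

print(consistency_score(attention_weights(same_topic)))
print(consistency_score(attention_weights(mixed_topic)))
```

On these toy inputs the single-topic timeline scores at the top of the range, while the timeline split between two unrelated topics scores noticeably lower. In a learned model, a feature of this kind would be fed to the classifier alongside the text and graph representations rather than thresholded directly.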