A Few Topical Tweets are Enough for Effective User-Level Stance Detection

Paper Title

A Few Topical Tweets are Enough for Effective User-Level Stance Detection

Authors

Younes Samih, Kareem Darwish

Abstract

Stance detection entails ascertaining the position of a user towards a target, such as an entity, topic, or claim. Recent work that employs unsupervised classification has shown that performing stance detection on vocal Twitter users, who have many tweets on a target, can yield very high accuracy (+98%). However, such methods perform poorly or fail completely for less vocal users, who may have authored only a few tweets about a target. In this paper, we tackle stance detection for such users using two approaches. In the first approach, we improve user-level stance detection by representing tweets using contextualized embeddings, which capture latent meanings of words in context. We show that this approach outperforms two strong baselines and achieves 89.6% accuracy and 91.3% macro F-measure on eight controversial topics. In the second approach, we expand the tweets of a given user using their Twitter timeline tweets, and then we perform unsupervised classification of the user, which entails clustering a user with other users in the training set. This approach achieves 95.6% accuracy and 93.1% macro F-measure.
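To illustrate why timeline expansion can help a less vocal user, here is a minimal sketch of assigning a user to an existing cluster of users. The abstract does not specify the features or the clustering algorithm, so everything concrete here is an assumption made for illustration: token-count vectors, mean cosine similarity as the assignment rule, clusters that have already been formed, and the names `user_vector` and `assign_cluster`. It is not the authors' implementation.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def user_vector(topical_tweets, timeline_tweets=()):
    """Bag-of-tokens vector for one user (assumed feature choice).

    Timeline tweets, when given, are simply added to the topical ones.
    """
    vec = Counter()
    for tweet in list(topical_tweets) + list(timeline_tweets):
        vec.update(tweet.lower().split())
    return vec


def assign_cluster(user_vec, clusters):
    """Return the cluster whose members are most similar on average.

    Returns None when the user shares no tokens with any cluster.
    """
    best_name, best_score = None, 0.0
    for name, members in clusters.items():
        score = sum(cosine(user_vec, m) for m in members) / len(members)
        if score > best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical, hand-made clusters of users from a training set.
clusters = {
    "pro": [user_vector(["#yes vote yes", "rt @alice yes"]),
            user_vector(["#yes rt @alice"])],
    "anti": [user_vector(["#no vote no", "rt @bob no"]),
             user_vector(["#no rt @bob"])],
}

# A less vocal user with a single topical tweet: no overlap with any cluster.
sparse = user_vector(["thoughts on this topic"])

# The same user after adding (hypothetical) timeline tweets.
expanded = user_vector(["thoughts on this topic"],
                       ["rt @alice great point", "#yes"])

sparse_label = assign_cluster(sparse, clusters)
expanded_label = assign_cluster(expanded, clusters)
```

In this toy data the single topical tweet gives no usable signal, while the expanded vector shares tokens mainly with the "pro" users and is assigned to that cluster.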
