Paper Title

X-Stance: A Multilingual Multi-Target Dataset for Stance Detection

Authors

Jannis Vamvas, Rico Sennrich

Abstract

We extract a large-scale stance detection dataset from comments written by candidates of elections in Switzerland. The dataset consists of German, French and Italian text, allowing for a cross-lingual evaluation of stance detection. It contains 67 000 comments on more than 150 political issues (targets). Unlike stance detection models that have specific target issues, we use the dataset to train a single model on all the issues. To make learning across targets possible, we prepend to each instance a natural question that represents the target (e.g. "Do you support X?"). Baseline results from multilingual BERT show that zero-shot cross-lingual and cross-target transfer of stance detection is moderately successful with this approach.
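The question-prepending idea in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the dataset or the authors' code: the `StanceExample` class, the `build_input` helper, the `[SEP]` separator string, the label names, and the sample comments.

```python
from dataclasses import dataclass

# Assumption: a BERT-style separator string between question and comment.
SEP = " [SEP] "


@dataclass
class StanceExample:
    """Hypothetical container for one instance (field names are illustrative)."""
    question: str  # natural-language question representing the target
    comment: str   # the candidate's comment
    label: str     # stance label; the label names below are assumed


def build_input(example: StanceExample, sep: str = SEP) -> str:
    """Prepend the target question to the comment to form one model input."""
    return f"{example.question.strip()}{sep}{example.comment.strip()}"


# Made-up examples; "Do you support X?" follows the pattern quoted in the abstract.
examples = [
    StanceExample("Do you support X?", "Yes, it would help.", "FAVOR"),
    StanceExample("Do you support Y?", "No, the costs are too high.", "AGAINST"),
]

# A single model can then be trained on inputs for all targets at once.
inputs = [build_input(e) for e in examples]
for text, e in zip(inputs, examples):
    print(e.label, "|", text)
```

Because the target is expressed as text rather than as a fixed class index, a question about a target that never appeared in training can be passed through the same function. That is the setting in which the abstract reports zero-shot cross-target transfer.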
