Paper Title
OpenStance: Real-world Zero-shot Stance Detection
Paper Authors
Paper Abstract
Prior studies of zero-shot stance detection identify the attitude of texts towards unseen topics that occur in the same document corpus. This task formulation has three limitations: (i) it is restricted to a single domain or dataset: a system is optimized on a particular dataset from a single domain, so it does not work well on other datasets; (ii) the model is evaluated on a limited number of unseen topics; and (iii) some topics are assumed to have rich annotations, which may be impossible in real-world applications. These drawbacks lead to an impractical stance detection system that fails to generalize to open domains and open-form topics. This work defines OpenStance: open-domain zero-shot stance detection, which aims to handle stance detection in an open world with neither domain constraints nor topic-specific annotations. The key challenge of OpenStance lies in open-domain generalization: learning a system from fully unspecific supervision that can nevertheless generalize to any dataset. To solve OpenStance, we propose combining indirect supervision from textual entailment datasets with weak supervision from data generated automatically by pre-trained language models. Our single system, without any topic-specific supervision, outperforms the supervised method on three popular datasets. To our knowledge, this is the first work that studies stance detection under the open-domain zero-shot setting. All data and code are publicly released.
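One plausible reading of "indirect supervision from textual entailment datasets" is that each (text, topic) pair is recast as a premise and a set of stance hypotheses, so that an entailment scorer can rank the stances. The sketch below illustrates only that recasting and is not the authors' implementation. The label set, the hypothesis templates, the function names, and the `toy_entail_score` word-overlap scorer are all assumptions made for illustration; the abstract specifies none of them, and a real system would plug in a trained entailment model instead.

```python
from typing import Callable, Dict, Tuple

# Hypothetical stance labels and hypothesis templates (not taken from the paper).
TEMPLATES: Dict[str, str] = {
    "support": "The author supports {topic}.",
    "oppose": "The author opposes {topic}.",
    "neutral": "The author is neutral towards {topic}.",
}


def build_pairs(text: str, topic: str) -> Dict[str, Tuple[str, str]]:
    """Return one (premise, hypothesis) pair per stance label."""
    return {
        label: (text, template.format(topic=topic))
        for label, template in TEMPLATES.items()
    }


def predict_stance(
    text: str,
    topic: str,
    entail_score: Callable[[str, str], float],
) -> str:
    """Pick the stance whose hypothesis gets the highest entailment score."""
    pairs = build_pairs(text, topic)
    scores = {
        label: entail_score(premise, hypothesis)
        for label, (premise, hypothesis) in pairs.items()
    }
    return max(scores, key=scores.get)


def toy_entail_score(premise: str, hypothesis: str) -> float:
    """Stand-in scorer: fraction of hypothesis words found in the premise.

    This is NOT an entailment model; it only makes the sketch runnable.
    """
    premise_words = set(premise.lower().replace(".", "").split())
    hypothesis_words = hypothesis.lower().replace(".", "").split()
    return sum(w in premise_words for w in hypothesis_words) / len(hypothesis_words)


if __name__ == "__main__":
    text = "The author strongly opposes nuclear power."
    print(predict_stance(text, "nuclear power", toy_entail_score))
```

Because the topic only appears inside the hypothesis string, nothing in this setup depends on a fixed topic inventory, which is the property the open-domain setting calls for. The quality of the predictions rests entirely on the scorer passed in as `entail_score`.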