Paper Title


Constructing interval variables via faceted Rasch measurement and multitask deep learning: a hate speech application

Authors

Kennedy, Chris J., Bacon, Geoff, Sahn, Alexander, von Vacano, Claudia

Abstract


We propose a general method for measuring complex variables on a continuous, interval spectrum by combining supervised deep learning with the Constructing Measures approach to faceted Rasch item response theory (IRT). We decompose the target construct, hate speech in our case, into multiple constituent components that are labeled as ordinal survey items. Those survey responses are transformed via IRT into a debiased, continuous outcome measure. Our method estimates the survey interpretation bias of the human labelers and eliminates that influence on the generated continuous measure. We further estimate the response quality of each labeler using faceted IRT, allowing responses from low-quality labelers to be removed. Our faceted Rasch scaling procedure integrates naturally with a multitask deep learning architecture for automated prediction on new data. The ratings on the theorized components of the target outcome are used as supervised, ordinal variables for the neural networks' internal concept learning. We test the use of an activation function (ordinal softmax) and loss function (ordinal cross-entropy) designed to exploit the structure of ordinal outcome variables. Our multitask architecture leads to a new form of model interpretation because each continuous prediction can be directly explained by the constituent components in the penultimate layer. We demonstrate this new method on a dataset of 50,000 social media comments sourced from YouTube, Twitter, and Reddit and labeled by 11,000 U.S.-based Amazon Mechanical Turk workers to measure a continuous spectrum from hate speech to counterspeech. We evaluate Universal Sentence Encoders, BERT, and RoBERTa as language representation models for the comment text, and compare our predictive accuracy to Google Jigsaw's Perspective API models, showing significant improvement over this standard benchmark.
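The abstract mentions an ordinal cross-entropy loss designed to exploit the structure of ordinal outcome variables. The paper's exact formulation is not given here, so the following is a minimal sketch of one common cumulative-link variant (encode ordinal level k over K levels as K-1 cumulative binary targets and sum binary cross-entropy across thresholds); the function names and the choice of encoding are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def ordinal_encode(label, num_classes):
    """Encode ordinal level `label` as K-1 cumulative binary targets.

    Example (assumed encoding): with K=4 levels, label 2 -> [1, 1, 0],
    i.e. target t is 1 iff label > t.
    """
    return np.array([1.0 if label > t else 0.0 for t in range(num_classes - 1)])

def ordinal_cross_entropy(logits, label, num_classes):
    """Binary cross-entropy summed over the K-1 cumulative thresholds.

    `logits` are K-1 raw scores; sigmoid(logits[t]) models P(y > t).
    """
    probs = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))
    target = ordinal_encode(label, num_classes)
    eps = 1e-12  # guard against log(0)
    return float(-np.sum(target * np.log(probs + eps)
                         + (1.0 - target) * np.log(1.0 - probs + eps)))
```

Unlike a plain softmax cross-entropy, this loss penalizes predictions more the further they fall from the true ordinal level, which is the property the abstract's ordinal loss is meant to capture.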
