Paper Title
Analyzing Hate Speech Data along Racial, Gender and Intersectional Axes
Paper Authors
Paper Abstract
To tackle the rising phenomenon of hate speech, efforts have been made towards data curation and analysis. When it comes to analysis of bias, previous work has focused predominantly on race. In our work, we further investigate bias in hate speech datasets along racial, gender and intersectional axes. We identify strong bias against African American English (AAE), masculine and AAE+Masculine tweets, which are annotated as disproportionately more hateful and offensive than tweets from other demographics. We provide evidence that BERT-based models propagate this bias and show that balancing the training data for these protected attributes can lead to fairer models with regard to gender, but not race.