Paper Title
BERTScore is Unfair: On Social Bias in Language Model-Based Metrics for Text Generation
Paper Authors
Paper Abstract
Automatic evaluation metrics are crucial to the development of generative systems. In recent years, pre-trained language model (PLM) based metrics, such as BERTScore, have been widely adopted across generation tasks. However, PLMs have been shown to encode a range of stereotypical societal biases, raising concerns about the fairness of PLM-based metrics. To this end, this work presents the first systematic study of social bias in PLM-based metrics. We demonstrate that popular PLM-based metrics exhibit significantly higher social bias than traditional metrics on six sensitive attributes, namely race, gender, religion, physical appearance, age, and socioeconomic status. In-depth analysis suggests that the choice of metric paradigm (matching, regression, or generation) has a greater impact on fairness than the choice of PLM. In addition, we develop debiasing adapters that are injected into PLM layers, mitigating bias in PLM-based metrics while retaining high performance for evaluating text generation.
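To make the bias concern concrete, below is a minimal sketch of how one might probe a PLM-based metric such as BERTScore with counterfactual candidate pairs that differ only in a sensitive-attribute term. The sentences and the probing protocol are illustrative assumptions, not the paper's benchmark; only the `bert-score` package API (`BERTScorer`) is taken as given.

```python
# Minimal bias probe for a PLM-based metric (illustrative, not the paper's protocol).
# Requires: pip install bert-score
from bert_score import BERTScorer

scorer = BERTScorer(lang="en", rescale_with_baseline=True)

reference = ["The doctor finished the shift and went home."]
candidate_a = ["He finished the shift and went home."]
candidate_b = ["She finished the shift and went home."]

# BERTScore returns precision, recall, and F1; F1 is the usual headline value.
_, _, f1_a = scorer.score(candidate_a, reference)
_, _, f1_b = scorer.score(candidate_b, reference)

print(f"F1(he)  = {f1_a.item():.4f}")
print(f"F1(she) = {f1_b.item():.4f}")
print(f"gap     = {abs(f1_a.item() - f1_b.item()):.4f}")
```

A consistent score gap across many such counterfactual pairs would indicate that the metric systematically favors one demographic term over another, which is the kind of unfairness the paper quantifies across the six sensitive attributes.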
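The abstract describes the debiasing adapters only at a high level. The following is a minimal sketch assuming a standard bottleneck-adapter design (down-projection, nonlinearity, up-projection, residual connection) inserted into PLM layers; the class name, dimensions, and training note are assumptions for illustration, not the authors' implementation.

```python
# Sketch of a bottleneck adapter that could be injected after a transformer
# sub-layer (illustrative; not the paper's exact architecture).
import torch
import torch.nn as nn

class DebiasAdapter(nn.Module):
    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)  # project down
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, hidden_size)    # project back up
        nn.init.zeros_(self.up.weight)                  # start near identity
        nn.init.zeros_(self.up.bias)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # The residual connection leaves the original PLM representation
        # intact when the adapter output is zero, which helps preserve
        # evaluation quality while the adapter learns to remove bias.
        return hidden_states + self.up(self.act(self.down(hidden_states)))

# Typically the PLM weights would be frozen and only the adapter parameters
# trained on a bias-mitigation objective.
adapter = DebiasAdapter()
x = torch.randn(2, 16, 768)   # (batch, sequence, hidden)
print(adapter(x).shape)       # torch.Size([2, 16, 768])
```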