Paper Title

Astraea: Grammar-based Fairness Testing

Authors

Ezekiel Soremekun, Sakshi Udeshi, Sudipta Chattopadhyay

Abstract

Software often produces biased outputs. In particular, machine learning (ML) based software is known to produce erroneous predictions when processing discriminatory inputs. Such unfair program behavior can be caused by societal bias. In the last few years, Amazon, Microsoft, and Google have provided software services that produce unfair outputs, mostly due to societal bias (e.g., gender or race). In such events, developers are saddled with the task of conducting fairness testing. Fairness testing is challenging; developers are tasked with generating discriminatory inputs that reveal and explain biases. We propose a grammar-based fairness testing approach (called ASTRAEA) which leverages context-free grammars to generate discriminatory inputs that reveal fairness violations in software systems. Using probabilistic grammars, ASTRAEA also provides fault diagnosis by isolating the cause of observed software bias. ASTRAEA's diagnoses facilitate the improvement of ML fairness. ASTRAEA was evaluated on 18 software systems that provide three major natural language processing (NLP) services. In our evaluation, ASTRAEA generated fairness violations at a rate of ~18%. ASTRAEA generated over 573K discriminatory test cases and found over 102K fairness violations. Furthermore, ASTRAEA improves software fairness by ~76% via model retraining.
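To make the core idea concrete, below is a minimal, hypothetical sketch of grammar-based fairness testing in Python. It is not the authors' implementation: the toy grammar, the sensitive name pair ("John"/"Mary"), the `fairness_test` helper, and the deliberately biased classifier are all illustrative assumptions. The sketch derives a sentence template from a context-free grammar, produces two inputs that differ only in the sensitive token, and flags a fairness violation whenever the system under test returns different outputs for the pair. ASTRAEA's probabilistic-grammar-based fault diagnosis is omitted here for brevity.

```python
# Minimal sketch of grammar-based fairness testing (illustrative, not ASTRAEA's code).
import random
import re

# Toy context-free grammar: non-terminals map to lists of expansions.
# <NAME> is the sensitive token that will be swapped between paired inputs.
GRAMMAR = {
    "<SENTENCE>": ["<NAME> is <ADJ> and works as a <JOB>."],
    "<ADJ>": ["hardworking", "lazy", "brilliant", "careless"],
    "<JOB>": ["nurse", "engineer", "teacher", "pilot"],
}
SENSITIVE_PAIR = ("John", "Mary")  # hypothetical sensitive-attribute values


def expand(symbol, rng):
    """Expand a grammar symbol; terminals and <NAME> are returned unchanged."""
    if symbol not in GRAMMAR:
        return symbol
    production = rng.choice(GRAMMAR[symbol])
    parts = re.split(r"(<[A-Z]+>)", production)  # split into terminals/non-terminals
    return "".join(expand(part, rng) for part in parts)


def fairness_test(classify, trials=100, seed=0):
    """Flag input pairs whose outputs differ when only the sensitive token changes."""
    rng = random.Random(seed)
    violations = []
    for _ in range(trials):
        template = expand("<SENTENCE>", rng)            # still contains <NAME>
        a = template.replace("<NAME>", SENSITIVE_PAIR[0])
        b = template.replace("<NAME>", SENSITIVE_PAIR[1])
        if classify(a) != classify(b):                  # differing outputs => bias
            violations.append((a, b))
    return violations


if __name__ == "__main__":
    # Toy biased classifier standing in for an NLP service under test.
    biased = lambda text: "negative" if ("Mary" in text and "lazy" in text) else "positive"
    for a, b in fairness_test(biased, trials=20):
        print("VIOLATION:", a, "||", b)
```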
