Paper Title
Fairness and Robustness in Invariant Learning: A Case Study in Toxicity Classification
Paper Authors
Paper Abstract
Robustness is of central importance in machine learning and has given rise to the fields of domain generalization and invariant learning, which are concerned with improving performance on a test distribution distinct from but related to the training distribution. In light of recent work suggesting an intimate connection between fairness and robustness, we investigate whether algorithms from robust ML can be used to improve the fairness of classifiers that are trained on biased data and tested on unbiased data. We apply Invariant Risk Minimization (IRM), a domain generalization algorithm that employs a causal-discovery-inspired method to find robust predictors, to the task of fairly predicting the toxicity of internet comments. We show that IRM achieves better out-of-distribution accuracy and fairness than Empirical Risk Minimization (ERM) methods, and analyze both the difficulties that arise when applying IRM in practice and the conditions under which IRM will likely be effective in this scenario. We hope that this work will inspire further studies of how robust machine learning methods relate to algorithmic fairness.
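For context on the method the abstract describes, below is a minimal sketch of the widely used IRMv1 penalty from Arjovsky et al. (2019), written in PyTorch. This is an illustrative implementation under common assumptions (binary classification with a logit-producing model), not the authors' released code; the function names `irmv1_penalty` and `irm_objective` and the choice of binary cross-entropy loss are assumptions for this sketch.

```python
import torch
import torch.nn.functional as F


def irmv1_penalty(logits: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """IRMv1 regularizer: squared gradient norm of the per-environment risk
    with respect to a fixed "dummy" classifier scale w = 1.0.

    A nonzero gradient means the classifier could lower its risk in this
    environment by rescaling, i.e., the representation is not simultaneously
    optimal across environments.
    """
    scale = torch.tensor(1.0, requires_grad=True)
    loss = F.binary_cross_entropy_with_logits(logits * scale, y)
    (grad,) = torch.autograd.grad(loss, [scale], create_graph=True)
    return (grad ** 2).sum()


def irm_objective(model, environments, lam: float = 1.0) -> torch.Tensor:
    """Sum of per-environment ERM risks plus the weighted IRMv1 penalty.

    `environments` is assumed to be an iterable of (x, y) batches, one per
    training environment, with y holding float labels in {0., 1.}.
    """
    total = torch.tensor(0.0)
    for x, y in environments:
        logits = model(x).squeeze(-1)
        erm_risk = F.binary_cross_entropy_with_logits(logits, y)
        total = total + erm_risk + lam * irmv1_penalty(logits, y)
    return total
```

In the toxicity-classification setting studied here, each training environment would correspond to a slice of the biased training data in which the spurious correlation between a demographic term and the toxicity label differs, so that minimizing the penalized objective discourages the model from relying on that correlation.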