Metrics and methods for a systematic comparison of fairness-aware machine learning algorithms

Paper title

Metrics and methods for a systematic comparison of fairness-aware machine learning algorithms

Authors

Jones, Gareth P., Hickey, James M., Di Stefano, Pietro G., Dhanjal, Charanpal, Stoddart, Laura C., Vasileiou, Vlasios

Abstract

Understanding and removing bias from the decisions made by machine learning models is essential to avoid discrimination against unprivileged groups. Despite recent progress in algorithmic fairness, there is still no clear answer as to which bias-mitigation approaches are most effective. Evaluation strategies are typically use-case specific, rely on data with unclear bias, and employ a fixed policy to convert model outputs to decision outcomes. To address these problems, we performed a systematic comparison of a number of popular fairness algorithms applicable to supervised classification. Our study is the most comprehensive of its kind. It utilizes three real and four synthetic datasets, and two different ways of converting model outputs to decisions. It considers fairness, predictive-performance, calibration quality, and speed of 28 different modelling pipelines, corresponding to both fairness-unaware and fairness-aware algorithms. We found that fairness-unaware algorithms typically fail to produce adequately fair models and that the simplest algorithms are not necessarily the fairest ones. We also found that fairness-aware algorithms can induce fairness without material drops in predictive power. Finally, we found that dataset idiosyncrasies (e.g., degree of intrinsic unfairness, nature of correlations) do affect the performance of fairness-aware approaches. Our results allow the practitioner to narrow down the approach(es) they would like to adopt without having to know in advance their fairness requirements.
