Paper Title
Fair for All: Best-effort Fairness Guarantees for Classification
Paper Authors
Paper Abstract
Standard approaches to group-based notions of fairness, such as \emph{parity} and \emph{equalized odds}, try to equalize absolute measures of performance across known groups (based on race, gender, etc.). Consequently, a group that is inherently harder to classify may hold back the performance on other groups; and no guarantees can be provided for unforeseen groups. Instead, we propose a fairness notion whose guarantee, on each group $g$ in a class $\mathcal{G}$, is relative to the performance of the best classifier on $g$. We apply this notion to broad classes of groups, in particular, where (a) $\mathcal{G}$ consists of all possible groups (subsets) in the data, and (b) $\mathcal{G}$ is more streamlined. For the first setting, which is akin to groups being completely unknown, we devise the {\sc PF} (Proportional Fairness) classifier, which guarantees, on any possible group $g$, an accuracy that is proportional to that of the optimal classifier for $g$, scaled by the relative size of $g$ in the data set. Due to including all possible groups, some of which could be too complex to be relevant, the worst-case theoretical guarantees here have to be proportionally weaker for smaller subsets. For the second setting, we devise the {\sc BeFair} (Best-effort Fair) framework which seeks an accuracy, on every $g \in \mathcal{G}$, which approximates that of the optimal classifier on $g$, independent of the size of $g$. Aiming for such a guarantee results in a non-convex problem, and we design novel techniques to get around this difficulty when $\mathcal{G}$ is the set of linear hypotheses. We test our algorithms on real-world data sets, and present interesting comparative insights on their performance.
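As an illustration of the Proportional Fairness guarantee stated above, the following minimal sketch (hypothetical helper, not the paper's implementation) checks whether a classifier's accuracy on a group $g$ meets the PF bound: accuracy on $g$ at least that of the optimal classifier on $g$, scaled by $g$'s relative size in the data set.

```python
def pf_guarantee_holds(acc_on_g: float, best_acc_on_g: float,
                       group_size: int, n: int) -> bool:
    """Check the PF bound for one group.

    acc_on_g:       accuracy of the candidate classifier on group g
    best_acc_on_g:  accuracy of the optimal classifier for g, on g
    group_size:     |g|, number of points in the group
    n:              total number of points in the data set
    """
    # PF guarantee: acc(h, g) >= (|g| / n) * acc(h*_g, g),
    # so the bound is proportionally weaker for smaller groups.
    return acc_on_g >= (group_size / n) * best_acc_on_g

# A large group demands a bound close to its optimal accuracy,
# while a tiny group is only guaranteed a small fraction of it.
print(pf_guarantee_holds(0.50, 0.90, group_size=100, n=1000))
print(pf_guarantee_holds(0.05, 0.90, group_size=500, n=1000))
```

Note how the scaling by `group_size / n` captures the trade-off described in the abstract: covering *all* possible subsets forces the worst-case guarantee to weaken proportionally for smaller groups.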