论文标题

在公平限制下选择多种数据选择

Diverse Data Selection under Fairness Constraints

论文作者

Moumoulidou, Zafeiria, McGregor, Andrew, Meliou, Alexandra

论文摘要

多样性是数据选择和汇总,设施位置和推荐系统的重要原则。我们的工作着重于最大程度地提高数据选择的多样性,同时提供公平性保证。特别是,我们提供的第一项研究以公平的限制增强了最大最大多元化目标。更具体地说,鉴于可以将$ n $元素的宇宙$ u $ u $ $ n $元素划分为$ m $分离组,我们的目标是检索一个$ k $ sized的子集,该子集可最大化集合(多样性)中的成对最小距离,并包含预先指定的$ k_i $ k_i $ $ $ $ $ $ $ $ k_i。我们表明,即使在公制空间中,这个问题也是NP的完整,我们提出了三种新颖的算法,即$ n $的线性,这些算法可为$ m $和$ k $的不同值提供强大的理论近似保证。最后,我们将算法和分析扩展到可以重叠的情况下。

Diversity is an important principle in data selection and summarization, facility location, and recommendation systems. Our work focuses on maximizing diversity in data selection, while offering fairness guarantees. In particular, we offer the first study that augments the Max-Min diversification objective with fairness constraints. More specifically, given a universe $U$ of $n$ elements that can be partitioned into $m$ disjoint groups, we aim to retrieve a $k$-sized subset that maximizes the pairwise minimum distance within the set (diversity) and contains a pre-specified $k_i$ number of elements from each group $i$ (fairness). We show that this problem is NP-complete even in metric spaces, and we propose three novel algorithms, linear in $n$, that provide strong theoretical approximation guarantees for different values of $m$ and $k$. Finally, we extend our algorithms and analysis to the case where groups can be overlapping.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源