Improved Approximation for Fair Correlation Clustering

Paper Title

Improved Approximation for Fair Correlation Clustering

Authors

Sara Ahmadian, Maryam Negahbani

Abstract

Correlation clustering is a ubiquitous paradigm in unsupervised machine learning where addressing unfairness is a major challenge. Motivated by this, we study Fair Correlation Clustering, where the data points may belong to different protected groups and the goal is to ensure fair representation of all groups across clusters. Our paper significantly generalizes and improves on the quality guarantees of previous work of Ahmadi et al. and Ahmadian et al. as follows.
- We allow the user to specify an arbitrary upper bound on the representation of each group in a cluster.
- Our algorithm allows individuals to have multiple protected features and ensures fairness simultaneously across them all.
- We prove guarantees for clustering quality and fairness in this general setting. Furthermore, this improves on the results for the special cases studied in previous work.
Our experiments on real-world data demonstrate that our clustering quality compared to the optimal solution is much better than what our theoretical result suggests.
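To make the problem statement concrete, here is a minimal Python sketch of the two quantities the abstract refers to: the correlation clustering cost and the per-group upper bound on representation. It is an illustration, not the authors' algorithm. The function names, the data layout, the use of disagreement counting on a complete signed graph as the cost, and the reading of "upper bound" as a maximum fraction of each cluster are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def disagreements(labels, positive_edges):
    """Count the pairs a clustering gets wrong on a complete signed graph.

    labels: dict mapping node -> cluster id
    positive_edges: set of frozenset({u, v}) pairs marked "similar";
    every other pair is treated as "dissimilar" (assumption of this sketch).
    """
    cost = 0
    for u, v in combinations(labels, 2):
        same_cluster = labels[u] == labels[v]
        similar = frozenset((u, v)) in positive_edges
        if same_cluster != similar:
            cost += 1
    return cost


def is_fair(labels, groups, upper_bound):
    """Check that no group exceeds its allowed fraction in any cluster.

    groups: dict mapping node -> set of protected groups it belongs to
            (a node may carry several protected features at once)
    upper_bound: dict mapping group -> maximum allowed fraction per cluster
                 (hypothetical parameter name)
    """
    cluster_sizes = Counter(labels.values())
    group_counts = Counter()
    for node, cluster in labels.items():
        for g in groups[node]:
            group_counts[(cluster, g)] += 1
    return all(
        count <= upper_bound[g] * cluster_sizes[cluster]
        for (cluster, g), count in group_counts.items()
    )


# Toy instance: two similar pairs, each mixing the two groups.
positive_edges = {frozenset(("a", "b")), frozenset(("c", "d"))}
groups = {"a": {"red"}, "b": {"blue"}, "c": {"red"}, "d": {"blue"}}
upper_bound = {"red": 0.5, "blue": 0.5}

paired = {"a": 0, "b": 0, "c": 1, "d": 1}
singletons = {"a": 0, "b": 1, "c": 2, "d": 3}
```

In this toy instance the `paired` clustering has no disagreements and satisfies the bound, while `singletons` separates both similar pairs and places each group at 100% of its own cluster, which violates a 0.5 bound.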
