Paper Title
Mechanisms that Incentivize Data Sharing in Federated Learning
Paper Authors
Abstract
Federated learning is typically considered a beneficial technology that allows multiple agents to collaborate with each other, improve the accuracy of their models, and solve problems that would otherwise be too data-intensive or expensive to solve individually. However, under the expectation that other agents will share their data, rational agents may be tempted to engage in detrimental behavior such as free-riding, where they contribute no data but still enjoy an improved model. In this work, we propose a framework to analyze the behavior of such rational data generators. We first show how a naive scheme leads to catastrophic levels of free-riding, where the benefits of data sharing are completely eroded. Then, using ideas from contract theory, we introduce accuracy-shaping-based mechanisms to maximize the amount of data generated by each agent. These provably prevent free-riding without needing any payment mechanism.
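The free-riding effect described in the abstract can be illustrated with a small toy simulation. This is only a sketch under stated assumptions, not the paper's model: the accuracy curve `accuracy`, the per-point data cost `cost`, the identical agents, and the best-response procedure are all hypothetical choices made for illustration. Each agent picks the amount of data that maximizes the shared model's accuracy minus its own collection cost, given what the others already contribute.

```python
import math


def accuracy(total_points: float) -> float:
    """Hypothetical concave accuracy curve (an assumption, not from the paper)."""
    return 1.0 - 1.0 / math.sqrt(1.0 + total_points)


def best_response(others_total: int, cost: float, max_points: int = 1000) -> int:
    """Number of points maximizing accuracy(pooled data) - cost * own points."""
    best_m = 0
    best_u = accuracy(others_total)  # utility of contributing nothing
    for m in range(1, max_points + 1):
        u = accuracy(others_total + m) - cost * m
        if u > best_u:
            best_m, best_u = m, u
    return best_m


def simulate_naive_sharing(n_agents: int, cost: float, rounds: int = 5):
    """Sequential best-response dynamics when every agent gets the pooled model."""
    contributions = [0] * n_agents
    for _ in range(rounds):
        for i in range(n_agents):
            others = sum(contributions) - contributions[i]
            contributions[i] = best_response(others, cost)
    # Amount a single agent would collect if it trained alone.
    solo = best_response(0, cost)
    return contributions, solo


contributions, solo = simulate_naive_sharing(n_agents=5, cost=0.001)
print("contributions:", contributions)
print("pooled total:", sum(contributions), "solo optimum:", solo)
```

In this toy setting the pooled total is no larger than what one agent would collect alone, and every agent after the first contributes nothing. This mirrors the abstract's claim about the naive scheme in spirit only; it does not reproduce the paper's results or its accuracy-shaping mechanism.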