Towards Federated Learning against Noisy Labels via Local Self-Regularization

Paper Title

Towards Federated Learning against Noisy Labels via Local Self-Regularization

Authors

Xuefeng Jiang, Sheng Sun, Yuwei Wang, Min Liu

Abstract

Federated learning (FL) aims to learn joint knowledge from a large number of decentralized devices with labeled data in a privacy-preserving manner. However, since high-quality labeled data require expensive human intelligence and effort, data with incorrect labels (called noisy labels) are ubiquitous in reality and inevitably cause performance degradation. Although many methods have been proposed to deal with noisy labels directly, they either require excessive computation overhead or violate the privacy protection principle of FL. To this end, we focus on this issue in FL, aiming to alleviate the performance degradation caused by noisy labels while guaranteeing data privacy. Specifically, we propose a Local Self-Regularization method, which effectively regularizes the local training process by implicitly hindering the model from memorizing noisy labels and explicitly narrowing the model output discrepancy between original and augmented instances using self-distillation. Experimental results demonstrate that our proposed method achieves notable resistance against noisy labels at various noise levels on three benchmark datasets. In addition, we integrate our method with existing state-of-the-art methods and achieve superior performance on the real-world dataset Clothing1M. The code is available at https://github.com/Sprinter1999/FedLSR.
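
To make the "narrowing the model output discrepancy" idea concrete, the following is a minimal sketch of one possible self-distillation term. It is not the authors' implementation (see the linked repository for that). The symmetric KL form, the temperature value, the weight `gamma`, and all function names are assumptions chosen purely for illustration; the abstract does not specify them.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def self_distillation_loss(logits_original, logits_augmented, temperature=2.0):
    """Discrepancy between the model's outputs on an original instance and
    on its augmented view. Assumption: a symmetric KL divergence between
    temperature-softened predictions; the paper may use a different form."""
    p = softmax(logits_original, temperature)
    q = softmax(logits_augmented, temperature)
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))

def cross_entropy(logits, label, eps=1e-12):
    """Standard cross-entropy against a (possibly noisy) integer label."""
    return -math.log(softmax(logits)[label] + eps)

def local_loss(logits_original, logits_augmented, label, gamma=0.5):
    """Hypothetical local objective: supervised term plus a weighted
    self-distillation regularizer. `gamma` is an illustrative weight."""
    supervised = cross_entropy(logits_original, label)
    regularizer = self_distillation_loss(logits_original, logits_augmented)
    return supervised + gamma * regularizer
```

In this sketch the regularizer vanishes when the two views produce identical outputs and grows as they diverge, so a client would be penalized for predictions that are unstable under augmentation. Because everything is computed from the client's own model and data, nothing beyond the usual model update would need to leave the device.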

Paper Title