Paper Title


Label Noise-Robust Learning using a Confidence-Based Sieving Strategy

Paper Authors

Reihaneh Torkzadehmahani, Reza Nasirigerdeh, Daniel Rueckert, Georgios Kaissis

Paper Abstract


In learning tasks with label noise, improving model robustness against overfitting is a pivotal challenge because the model eventually memorizes labels, including the noisy ones. Identifying the samples with noisy labels and preventing the model from learning them is a promising approach to address this challenge. When training with noisy labels, the per-class confidence scores of the model, represented by the class probabilities, can be a reliable criterion for assessing whether the input label is the true label or a corrupted one. In this work, we exploit this observation and propose a novel discriminator metric called confidence error and a sieving strategy called CONFES to effectively differentiate between clean and noisy samples. We provide theoretical guarantees on the probability of error for our proposed metric. We then experimentally illustrate the superior performance of our proposed approach compared to recent studies in various settings, such as synthetic and real-world label noise. Moreover, we show that CONFES can be combined with other state-of-the-art approaches, such as Co-teaching and DivideMix, to further improve model performance.
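The sieving idea in the abstract can be illustrated with a small sketch. The abstract does not spell out the exact formula for the confidence error, so the definition below (the gap between the model's top class probability and the probability it assigns to the given label) and the `threshold` value are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def confidence_error(probs, labels):
    """Per-sample confidence error (assumed definition for illustration):
    the gap between the model's highest class probability and the
    probability it assigns to the annotated label. A small gap suggests
    the label agrees with the model's belief (likely clean); a large gap
    suggests a possibly corrupted label."""
    top = probs.max(axis=1)
    given = probs[np.arange(len(labels)), labels]
    return top - given

def sieve(probs, labels, threshold=0.1):
    """Split sample indices into presumed-clean and presumed-noisy sets
    by thresholding the confidence error (threshold is hypothetical)."""
    ce = confidence_error(probs, labels)
    clean = np.where(ce <= threshold)[0]
    noisy = np.where(ce > threshold)[0]
    return clean, noisy

# Toy example: 3 samples, 3 classes; sample 1's label disagrees
# with the model's prediction, so it is sieved out as noisy.
probs = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.7, 0.1],
                  [0.6, 0.3, 0.1]])
labels = np.array([0, 2, 0])
clean, noisy = sieve(probs, labels)
# clean -> samples 0 and 2, noisy -> sample 1
```

A training loop would then fit the model only on the presumed-clean subset (or, as the abstract notes, feed the split into methods such as Co-teaching or DivideMix).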
