Paper Title


No Regret Sample Selection with Noisy Labels

Authors

Song, H., Mitsuo, N., Uchida, S., Suehiro, D.

Abstract


Deep neural networks (DNNs) suffer from noisy-labeled data because of the risk of overfitting. To avoid this risk, we propose a novel DNN training method with sample selection based on adaptive k-set selection, which selects k (< n) clean sample candidates from the whole set of n noisy training samples at each epoch. It has the strong advantage that the performance of the selection is theoretically guaranteed: roughly speaking, the regret of the proposed method, defined as the difference between the actual selection and the best selection, is bounded, even though the best selection is unknown until the end of all epochs. Experimental results on multiple noisy-labeled datasets demonstrate that our sample selection strategy works effectively in DNN training; in fact, the proposed method achieved the best or second-best performance among state-of-the-art methods while requiring a significantly lower computational cost. The code is available at https://github.com/songheony/TAkS.
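To make the per-epoch k-set selection concrete, here is a minimal sketch of the idea in Python. Note the assumptions: the actual paper (TAkS) derives an adaptive, regret-bounded online selection rule, whereas the function `select_k_set` below is a hypothetical stand-in that simply keeps the k smallest-loss samples each epoch, which is only the simplest small-loss instantiation of the idea.

```python
import numpy as np

def select_k_set(losses, k):
    """Illustrative k-set selection: return the indices of the k
    smallest-loss samples. (The paper's adaptive selection uses an
    online-learning update with a regret bound; not reproduced here.)"""
    return np.argsort(losses)[:k]

# Toy training loop: per-sample losses would come from the model.
rng = np.random.default_rng(0)
n, k = 10, 6
for epoch in range(3):
    losses = rng.random(n)              # placeholder for per-sample losses
    clean_idx = select_k_set(losses, k) # candidate clean set for this epoch
    # ... update the DNN using only the samples in clean_idx ...
```

The selected set can change from epoch to epoch as the losses evolve, which is what the regret analysis compares against the (unknown) best fixed selection.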
