Paper Title

Utilizing Class Separation Distance for the Evaluation of Corruption Robustness of Machine Learning Classifiers

Authors

Georg Siedel, Silvia Vock, Andrey Morozov, Stefan Voß

Abstract

Robustness is a fundamental pillar of Machine Learning (ML) classifiers, substantially determining their reliability. Methods for assessing classifier robustness are therefore essential. In this work, we address the challenge of evaluating corruption robustness in a way that allows comparability and interpretability on a given dataset. We propose a test data augmentation method that uses a robustness distance $ε$ derived from the dataset's minimal class separation distance. The resulting MSCR (mean statistical corruption robustness) metric allows a dataset-specific comparison of different classifiers with respect to their corruption robustness. The MSCR value is interpretable, as it represents the classifier's avoidable loss of accuracy due to statistical corruptions. On 2D and image data, we show that the metric reflects different levels of classifier robustness. Furthermore, by training and testing classifiers with different levels of noise, we observe unexpected optima in robust accuracy. While researchers have frequently reported a significant accuracy tradeoff when training robust models, we strengthen the view that a tradeoff between accuracy and corruption robustness is not inherent. Our results indicate that robustness training through simple data augmentation can already slightly improve accuracy.
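
The abstract outlines a three-step pipeline: derive a perturbation radius $ε$ from the dataset's minimal class separation distance, corrupt the test set with random noise of that magnitude, and compare corrupted against clean accuracy. The sketch below illustrates one plausible reading of that pipeline; the helper names (`min_class_separation`, `mscr`), the choice of $ε$ as half the minimal separation, the uniform noise model, and the exact normalization of the metric are illustrative assumptions, not the paper's definitions.

```python
# Illustrative MSCR-style evaluation sketch, NOT the paper's exact definition.
# Assumptions: eps = d_min / 2, uniform per-feature noise in [-eps, eps],
# and MSCR reported as the mean relative accuracy drop under corruption.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def min_class_separation(X, y):
    """Smallest distance between any pair of samples from different classes."""
    d_min = np.inf
    for c in np.unique(y):
        # Nearest out-of-class neighbor for every sample of class c.
        nn = NearestNeighbors(n_neighbors=1).fit(X[y != c])
        dists, _ = nn.kneighbors(X[y == c])
        d_min = min(d_min, dists.min())
    return d_min

def mscr(clf, X_test, y_test, eps, n_draws=20, seed=0):
    """Mean relative accuracy drop when each test point is perturbed
    by noise drawn uniformly from [-eps, eps] per feature."""
    rng = np.random.default_rng(seed)
    acc_clean = (clf.predict(X_test) == y_test).mean()
    corrupted = [
        (clf.predict(X_test + rng.uniform(-eps, eps, X_test.shape)) == y_test).mean()
        for _ in range(n_draws)
    ]
    return (acc_clean - np.mean(corrupted)) / acc_clean
```

Under these assumptions, one would call `eps = min_class_separation(X_train, y_train) / 2` and then `mscr(clf, X_test, y_test, eps)`; a value near zero indicates the classifier loses almost no accuracy under statistical corruptions at the dataset-derived radius, matching the abstract's reading of MSCR as avoidable accuracy loss.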
