Paper Title
Augmentation by Counterfactual Explanation -- Fixing an Overconfident Classifier
Paper Authors
Paper Abstract
A highly accurate but overconfident model is ill-suited for deployment in critical applications such as healthcare and autonomous driving. The classification outcome should reflect high uncertainty on ambiguous in-distribution samples that lie close to the decision boundary. The model should also refrain from making overconfident decisions on samples that lie far outside its training distribution (far-out-of-distribution, far-OOD) or on unseen samples from novel classes that lie near its training distribution (near-OOD). This paper proposes an application of counterfactual explanations to fixing an overconfident classifier. Specifically, we propose to fine-tune a given pre-trained classifier, using augmentations from a counterfactual explainer (ACE), to fix its uncertainty characteristics while retaining its predictive performance. We perform extensive experiments on detecting far-OOD, near-OOD, and ambiguous samples. Our empirical results show that the revised model has improved uncertainty measures, and its performance is competitive with state-of-the-art methods.
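The abstract describes fine-tuning a pre-trained classifier on counterfactually augmented samples so that its confidence decays smoothly toward the decision boundary, but it does not spell out the procedure. Below is a minimal sketch of one such fine-tuning step in PyTorch, assuming a hypothetical `explainer(x, y_cf, lam)` callable that moves inputs toward a counterfactual class with strength `lam`; all names are illustrative and this is not the authors' actual ACE implementation.

```python
import torch
import torch.nn.functional as F

def counterfactual_augment(explainer, x, y, num_classes):
    """Generate samples along a counterfactual path with soft labels that
    mix the original and counterfactual classes (illustrative stub)."""
    # Pick a random counterfactual target class different from y.
    y_cf = (y + torch.randint(1, num_classes, y.shape)) % num_classes
    # Per-sample interpolation strength: 0 = original, 1 = full counterfactual.
    lam = torch.rand(x.size(0))
    # Hypothetical explainer call: perturb x toward class y_cf by amount lam.
    x_aug = explainer(x, y_cf, lam)
    # Soft label interpolated between the two classes, so confidence
    # degrades gracefully for samples near the decision boundary.
    y_soft = (1 - lam[:, None]) * F.one_hot(y, num_classes).float() \
             + lam[:, None] * F.one_hot(y_cf, num_classes).float()
    return x_aug, y_soft

def fine_tune_step(classifier, explainer, optimizer, x, y, num_classes):
    """One fine-tuning step on original plus counterfactually augmented data."""
    x_aug, y_soft = counterfactual_augment(explainer, x, y, num_classes)
    logits = classifier(torch.cat([x, x_aug]))
    targets = torch.cat([F.one_hot(y, num_classes).float(), y_soft])
    # Cross-entropy with probabilistic (soft) targets; the hard labels on the
    # original samples help retain predictive performance.
    loss = F.cross_entropy(logits, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Under this reading, at test time the revised model's predictive uncertainty (e.g. entropy of the softmax output) can be thresholded to flag ambiguous, near-OOD, and far-OOD samples, which matches the detection experiments the abstract reports.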