Paper Title
Towards Trustworthy Predictions from Deep Neural Networks with Fast Adversarial Calibration
Paper Authors
Paper Abstract
To facilitate widespread acceptance of AI systems guiding decision-making in real-world applications, trustworthiness of deployed models is key. That is, it is crucial for predictive models to be uncertainty-aware and to yield well-calibrated (and thus trustworthy) predictions both for in-domain samples and under domain shift. Recent efforts to account for predictive uncertainty include post-processing steps for trained neural networks, Bayesian neural networks, and alternative non-Bayesian approaches such as ensembles and evidential deep learning. Here, we propose an efficient yet general modelling approach for obtaining well-calibrated, trustworthy probabilities for samples obtained after a domain shift. We introduce a new training strategy combining an entropy-encouraging loss term with an adversarial calibration loss term and demonstrate that this results in well-calibrated and technically trustworthy predictions under a wide range of domain drifts. We comprehensively evaluate previously proposed approaches across data modalities, a large range of data sets (including sequence data), network architectures, and perturbation strategies. We observe that our modelling approach substantially outperforms existing state-of-the-art approaches, yielding well-calibrated predictions under domain drift.
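The abstract describes the training strategy only at a high level. As an illustration of what a combined objective of this kind might look like, the sketch below pairs a cross-entropy loss carrying an entropy-encouraging term with an FGSM-based adversarial calibration penalty in PyTorch. The function names, the L2 confidence-accuracy gap used as the calibration penalty, and the coefficients `lam`, `gamma`, and `eps` are all assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def entropy_encouraging_loss(logits, targets, lam=0.1):
    """Cross-entropy plus a term rewarding higher predictive entropy.

    `lam` is a hypothetical weighting coefficient; the paper's exact
    formulation and hyperparameters may differ.
    """
    ce = F.cross_entropy(logits, targets)
    probs = F.softmax(logits, dim=1)
    entropy = -(probs * torch.log(probs + 1e-12)).sum(dim=1).mean()
    return ce - lam * entropy  # subtracting entropy encourages it to grow

def adversarial_calibration_loss(model, x, targets, eps=0.05):
    """L2 gap between confidence and accuracy on FGSM-perturbed inputs.

    A simplified stand-in for the paper's adversarial calibration term.
    """
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), targets)
    grad, = torch.autograd.grad(loss, x_adv)
    x_adv = (x_adv + eps * grad.sign()).detach()  # one FGSM step

    probs = F.softmax(model(x_adv), dim=1)
    conf, preds = probs.max(dim=1)
    acc = (preds == targets).float()
    return ((conf - acc) ** 2).mean()  # pushes confidence toward accuracy

def total_loss(model, x, targets, lam=0.1, gamma=1.0):
    """Combined objective: entropy-encouraging CE + adversarial calibration."""
    logits = model(x)
    return (entropy_encouraging_loss(logits, targets, lam)
            + gamma * adversarial_calibration_loss(model, x, targets))
```

In this reading, the entropy term discourages overconfident softmax outputs, while the calibration term, evaluated on adversarially perturbed inputs as a proxy for domain shift, directly penalizes miscalibration where the model is most likely to be overconfident.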