Rethinking Out-of-Distribution Detection From a Human-Centric Perspective

Paper Title

Rethinking Out-of-Distribution Detection From a Human-Centric Perspective

Authors

Yao Zhu, Yuefeng Chen, Xiaodan Li, Rong Zhang, Hui Xue, Xiang Tian, Rongxin Jiang, Bolun Zheng, Yaowu Chen

Abstract

Out-of-distribution (OOD) detection has received broad attention over the years. It aims to ensure the reliability and safety of deep neural networks (DNNs) in real-world scenarios by rejecting incorrect predictions. However, we notice a discrepancy between the conventional evaluation and the essential purpose of OOD detection. On the one hand, the conventional evaluation exclusively considers risks caused by label-space distribution shifts while ignoring the risks from input-space distribution shifts. On the other hand, the conventional evaluation rewards detection methods for not rejecting misclassified images in the validation dataset, even though misclassified images can also cause risks and should be rejected. We call for rethinking OOD detection from a human-centric perspective: a proper detection method should reject cases where the deep model's prediction mismatches human expectations and accept cases where the prediction meets human expectations. We propose a human-centric evaluation and conduct extensive experiments on 45 classifiers and 8 test datasets. We find that a simple baseline OOD detection method can achieve comparable or even better performance than recently proposed methods, which suggests that progress in OOD detection over the past years may have been overestimated. Additionally, our experiments demonstrate that model selection is non-trivial for OOD detection and should be considered an integral part of the proposed method, which differs from the claim in existing works that proposed methods are universal across different models.
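
The difference between the two evaluation protocols described in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It assumes the maximum softmax probability as the detection score (the abstract does not name its baseline), AUROC as the metric, and a hypothetical convention in which a target of `-1` marks a sample whose true class lies outside the label space. All function names are illustrative.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def conventional_labels(target):
    # Conventional protocol: accept (1) every sample from the label space,
    # whether or not the classifier got it right. target == -1 is the
    # hypothetical marker for an out-of-label-space sample.
    return (target >= 0).astype(int)

def human_centric_labels(pred, target):
    # Human-centric protocol: accept (1) only when the prediction matches
    # the human label; misclassified and out-of-label-space samples are 0.
    return ((target >= 0) & (pred == target)).astype(int)

def auroc(scores, labels):
    # Probability that an accepted sample outscores a rejected one,
    # counting ties as one half.
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))

# Toy data (made up for illustration): three classes, four samples.
logits = np.array([
    [4.0, 0.0, 0.0],  # confident and correct (target 0)
    [0.0, 3.0, 0.0],  # confident and correct (target 1)
    [3.5, 0.0, 0.0],  # confident but wrong (target 2)
    [0.5, 0.4, 0.3],  # low confidence, outside the label space
])
target = np.array([0, 1, 2, -1])

probs = softmax(logits)
pred = probs.argmax(axis=1)
score = probs.max(axis=1)  # assumed baseline score: max softmax probability

print("conventional AUROC: ", auroc(score, conventional_labels(target)))
print("human-centric AUROC:", auroc(score, human_centric_labels(pred, target)))
```

On this toy data the conventional protocol gives an AUROC of 1.0 because the confidently misclassified sample counts as one to accept, while the human-centric protocol gives 0.75 because that same sample should have been rejected.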
