Paper Title
Rethinking Reconstruction Autoencoder-Based Out-of-Distribution Detection
Authors
Abstract
In some scenarios, a classifier must detect out-of-distribution samples that lie far from its training data. Reconstruction autoencoder-based methods, which have desirable characteristics, address this problem by using the input reconstruction error as a metric of novelty vs. normality. We formulate the essence of this approach as a quadruplet domain translation with an intrinsic bias to query only for a proxy of conditional data uncertainty. Accordingly, we formalize an improvement direction: maximally compress the autoencoder's latent space while preserving its reconstructive power, so that it acts as the described domain translator. From this, we introduce strategies including semantic reconstruction, data certainty decomposition, and normalized L2 distance that substantially improve on the original methods. Together they establish state-of-the-art performance on various benchmarks; e.g., the FPR@95%TPR of CIFAR-100 vs. TinyImagenet-crop on Wide-ResNet is 0.2%. Importantly, our method requires no additional data, hard-to-implement structure, or time-consuming pipeline, and does not harm the classification accuracy of known classes.
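As an illustration of the FPR@95%TPR metric quoted above, the following is a minimal sketch and not the authors' evaluation code. It assumes a common convention: in-distribution samples are the positive class, and a higher score (for example, a reconstruction error) means a sample looks more novel. The function name `fpr_at_tpr` and its parameters are hypothetical.

```python
import math


def fpr_at_tpr(id_scores, ood_scores, tpr=0.95):
    """Fraction of OOD samples accepted when `tpr` of ID samples are accepted.

    Assumed convention (not taken from the paper): a sample is accepted as
    in-distribution when its novelty score is <= the threshold.
    """
    if not id_scores or not ood_scores:
        raise ValueError("both score lists must be non-empty")
    ranked = sorted(id_scores)
    # Smallest threshold that accepts at least `tpr` of the ID samples.
    # The small epsilon guards against floating-point overshoot in tpr * n.
    k = max(1, math.ceil(tpr * len(ranked) - 1e-9))
    threshold = ranked[k - 1]
    # OOD samples at or below the threshold are false positives.
    accepted_ood = sum(1 for s in ood_scores if s <= threshold)
    return accepted_ood / len(ood_scores)


if __name__ == "__main__":
    # Toy scores chosen by hand for illustration only.
    in_dist = list(range(1, 21))      # 20 in-distribution scores
    out_dist = [5, 19, 25, 30]        # 4 out-of-distribution scores
    print(fpr_at_tpr(in_dist, out_dist))
```

A lower value is better: it means fewer out-of-distribution samples slip through at the operating point where 95% of in-distribution samples are still accepted. With tied scores at the threshold, the realized true positive rate can exceed the requested one.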