Paper Title

Layer-Stack Temperature Scaling

Paper Authors

Amr Khalifa, Michael C. Mozer, Hanie Sedghi, Behnam Neyshabur, Ibrahim Alabdulmohsin

Paper Abstract

Recent works demonstrate that early layers in a neural network contain useful information for prediction. Inspired by this, we show that extending temperature scaling across all layers improves both calibration and accuracy. We call this procedure "layer-stack temperature scaling" (LATES). Informally, LATES grants each layer a weighted vote during inference. We evaluate it on five popular convolutional neural network architectures both in- and out-of-distribution and observe a consistent improvement over temperature scaling in terms of accuracy, calibration, and AUC. All conclusions are supported by comprehensive statistical analyses. Since LATES neither retrains the architecture nor introduces many more parameters, its advantages can be reaped without requiring additional data beyond what is used in temperature scaling. Finally, we show that combining LATES with Monte Carlo Dropout matches state-of-the-art results on CIFAR10/100.
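
The abstract describes LATES only informally, as a temperature-scaled, weighted vote across layers at inference time. The sketch below illustrates how such a combiner could look in PyTorch; it is a hypothetical reconstruction, not the authors' implementation. In particular, the class name, the per-layer linear probes that map pooled intermediate features to class logits, and the softmax-normalized voting weights are all assumptions made for illustration.

```python
import torch
import torch.nn as nn

class LayerStackTemperatureScaling(nn.Module):
    """Hypothetical sketch of a LATES-style combiner: each layer casts a
    temperature-scaled, weighted vote on the class probabilities."""

    def __init__(self, feature_dims, num_classes):
        super().__init__()
        # One linear probe per layer, mapping its pooled features to logits
        # (an assumption; the paper may attach heads differently).
        self.heads = nn.ModuleList(
            nn.Linear(d, num_classes) for d in feature_dims
        )
        # One temperature per layer, log-parameterized to stay positive;
        # initialized to log(1) = 0, i.e. standard (unscaled) logits.
        self.log_temps = nn.Parameter(torch.zeros(len(feature_dims)))
        # One voting weight per layer, normalized with a softmax.
        self.vote_logits = nn.Parameter(torch.zeros(len(feature_dims)))

    def forward(self, layer_features):
        # layer_features: list of [batch, d_i] pooled activations,
        # one tensor per layer of the frozen backbone.
        weights = torch.softmax(self.vote_logits, dim=0)
        probs = 0.0
        for i, feats in enumerate(layer_features):
            logits = self.heads[i](feats) / self.log_temps[i].exp()
            probs = probs + weights[i] * torch.softmax(logits, dim=-1)
        return probs  # weighted vote over all layers
```

Under this reading, the backbone stays frozen and only these few parameters (probes, temperatures, voting weights) would be fit on the same held-out set that standard temperature scaling uses, which is consistent with the abstract's claim that LATES needs neither retraining nor data beyond what temperature scaling already requires.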
