Paper Title

A t-distribution based operator for enhancing out of distribution robustness of neural network classifiers

Authors

Niccolò Antonello, Philip N. Garner

Abstract


Neural Network (NN) classifiers can assign extreme probabilities to samples that have not appeared during training (out-of-distribution samples), resulting in erroneous and unreliable predictions. One of the causes of this unwanted behaviour lies in the use of the standard softmax operator, which pushes the posterior probabilities to be either zero or unity, hence failing to model uncertainty. The statistical derivation of the softmax operator relies on the assumption that the distributions of the latent variables for a given class are Gaussian with known variance. However, it is possible to use different assumptions in the same derivation and obtain operators based on other families of distributions as well. This allows derivation of novel operators with more favourable properties. Here, a novel operator is proposed that is derived using $t$-distributions, which are capable of providing a better description of uncertainty. It is shown that classifiers adopting this novel operator can be more robust to out-of-distribution samples, often outperforming NNs that use the standard softmax operator. These enhancements can be reached with minimal changes to the NN architecture.
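The intuition behind the abstract can be illustrated with a minimal sketch (an assumption for illustration, not the paper's exact operator): both softmax and a $t$-distribution based alternative can be viewed as normalised class-conditional densities evaluated at a latent score. With Gaussian likelihoods the posterior collapses toward zero or one far from the class means, while the heavier tails of the Student-$t$ density keep it closer to uniform:

```python
import numpy as np

def softmax_from_gaussians(z, mus, sigma=1.0):
    # Gaussian class-conditional likelihoods with shared known variance:
    # normalising them recovers the standard softmax over the logits.
    log_lik = -0.5 * ((z - mus) / sigma) ** 2
    log_lik -= log_lik.max()          # subtract max for numerical stability
    p = np.exp(log_lik)
    return p / p.sum()

def t_operator(z, mus, sigma=1.0, nu=2.0):
    # Student-t class-conditional likelihoods (hypothetical stand-in for the
    # paper's operator): heavier tails decay polynomially, so a sample far
    # from every class mean yields a less over-confident posterior.
    log_lik = -0.5 * (nu + 1) * np.log1p(((z - mus) / sigma) ** 2 / nu)
    log_lik -= log_lik.max()
    p = np.exp(log_lik)
    return p / p.sum()

mus = np.array([-1.0, 1.0])           # two hypothetical class means
p_gauss = softmax_from_gaussians(10.0, mus)   # far out-of-distribution score
p_t = t_operator(10.0, mus)
print(p_gauss)   # near one-hot: extreme confidence
print(p_t)       # much closer to uniform: uncertainty is preserved
```

For a score of 10, far from both class means, the Gaussian-derived softmax assigns almost all mass to one class, whereas the $t$-based operator spreads it, mirroring the robustness claim in the abstract.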
