Paper Title
Sampling Prediction-Matching Examples in Neural Networks: A Probabilistic Programming Approach
Paper Authors
Abstract
Though neural network models demonstrate impressive performance, we do not understand exactly how these black-box models make individual predictions. This drawback has led to substantial research devoted to understanding these models in areas such as robustness, interpretability, and generalization ability. In this paper, we consider the problem of exploring the prediction level sets of a classifier using probabilistic programming. We define a prediction level set to be the set of examples, drawn with respect to some arbitrary data distribution, for which the predictor has the same specified prediction confidence. Notably, our sampling-based method does not require the classifier to be differentiable, making it compatible with arbitrary classifiers. As a specific instantiation, if we take the classifier to be a neural network and the data distribution to be that of the training data, we can obtain examples that will result in specified predictions by the neural network. We demonstrate this technique with experiments on a synthetic dataset and MNIST. Such level sets in classification may facilitate human understanding of classification behaviors.
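To make the idea of sampling from a prediction level set concrete, here is a minimal sketch. It is not the paper's implementation: the abstract does not say which probabilistic programming system or inference algorithm is used, so the sketch substitutes a plain random-walk Metropolis sampler. The toy classifier, the standard Gaussian data distribution, the soft Gaussian `tolerance` on the confidence mismatch, and all function names are assumptions made for illustration. The sampler only evaluates the classifier and never takes its gradient, which mirrors the abstract's point that differentiability is not required.

```python
import math
import random


def classifier_confidence(x):
    """Hypothetical black-box classifier: class-1 confidence for a 2-D input.

    The abs() makes the score non-differentiable at x[0] == 0; the sampler
    below only ever calls this function, it never needs a gradient.
    """
    score = abs(x[0]) + x[1] - 1.0
    return 1.0 / (1.0 + math.exp(-score))


def log_data_density(x):
    """Assumed data distribution: standard 2-D Gaussian (unnormalized log-density)."""
    return -0.5 * (x[0] ** 2 + x[1] ** 2)


def log_target(x, confidence, tolerance):
    """Data log-density plus a soft penalty on the confidence mismatch.

    The Gaussian penalty with width `tolerance` is an assumption of this
    sketch; it relaxes the exact level-set constraint so that a sampler
    can move along it.
    """
    mismatch = classifier_confidence(x) - confidence
    return log_data_density(x) - 0.5 * (mismatch / tolerance) ** 2


def sample_level_set(confidence, n_samples=2000, burn_in=2000,
                     step=0.3, tolerance=0.02, seed=0):
    """Random-walk Metropolis over inputs whose confidence is close to `confidence`."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    current = log_target(x, confidence, tolerance)
    samples = []
    for i in range(burn_in + n_samples):
        # Symmetric Gaussian proposal around the current point.
        proposal = [xi + rng.gauss(0.0, step) for xi in x]
        candidate = log_target(proposal, confidence, tolerance)
        # Metropolis rule: accept with probability min(1, target ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            x, current = proposal, candidate
        if i >= burn_in:
            samples.append(list(x))
    return samples


if __name__ == "__main__":
    draws = sample_level_set(confidence=0.7)
    mean_conf = sum(classifier_confidence(s) for s in draws) / len(draws)
    print(f"mean confidence of sampled inputs: {mean_conf:.3f}")
```

A smaller `tolerance` keeps samples closer to the exact level set but makes the random walk mix more slowly, so in practice the two would need to be tuned together.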