Paper Title

FiLM-Ensemble: Probabilistic Deep Learning via Feature-wise Linear Modulation

Paper Authors

Turkoglu, Mehmet Ozgur, Becker, Alexander, Gündüz, Hüseyin Anil, Rezaei, Mina, Bischl, Bernd, Daudt, Rodrigo Caye, D'Aronco, Stefano, Wegner, Jan Dirk, Schindler, Konrad

Paper Abstract

The ability to estimate epistemic uncertainty is often crucial when deploying machine learning in the real world, but modern methods often produce overconfident, uncalibrated uncertainty predictions. A common approach to quantify epistemic uncertainty, usable across a wide class of prediction models, is to train a model ensemble. In a naive implementation, the ensemble approach has high computational cost and high memory demand. This challenges in particular modern deep learning, where even a single deep network is already demanding in terms of compute and memory, and has given rise to a number of attempts to emulate the model ensemble without actually instantiating separate ensemble members. We introduce FiLM-Ensemble, a deep, implicit ensemble method based on the concept of Feature-wise Linear Modulation (FiLM). That technique was originally developed for multi-task learning, with the aim of decoupling different tasks. We show that the idea can be extended to uncertainty quantification: by modulating the network activations of a single deep network with FiLM, one obtains a model ensemble with high diversity, and consequently well-calibrated estimates of epistemic uncertainty, with low computational overhead in comparison. Empirically, FiLM-Ensemble outperforms other implicit ensemble methods, and it comes very close to the upper bound of an explicit ensemble of networks (sometimes even beating it), at a fraction of the memory cost.
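
The abstract describes modulating the activations of a single shared network with FiLM so that each implicit ensemble member gets its own feature-wise affine parameters. The following is a minimal, hypothetical PyTorch sketch of that idea, not the authors' implementation; the class `FiLMEnsembleLayer`, the member-indexed `gamma`/`beta` parameters, and the `backbone_block`/`head` modules are illustrative names introduced here for the example.

```python
import torch
import torch.nn as nn


class FiLMEnsembleLayer(nn.Module):
    """Illustrative sketch: per-member feature-wise linear modulation.

    Each of the `n_members` implicit ensemble members owns its own
    (gamma, beta) pair per channel, while the backbone weights are shared.
    """

    def __init__(self, n_channels: int, n_members: int):
        super().__init__()
        # One scale and one shift vector per ensemble member.
        self.gamma = nn.Parameter(torch.ones(n_members, n_channels))
        self.beta = nn.Parameter(torch.zeros(n_members, n_channels))

    def forward(self, x: torch.Tensor, member: int) -> torch.Tensor:
        # x: (batch, channels, H, W); apply the member-specific affine transform.
        g = self.gamma[member].view(1, -1, 1, 1)
        b = self.beta[member].view(1, -1, 1, 1)
        return g * x + b


if __name__ == "__main__":
    # Usage sketch: average the predictive distributions of the implicit members.
    n_members, n_channels = 4, 16
    backbone_block = nn.Conv2d(3, n_channels, kernel_size=3, padding=1)  # shared backbone (toy)
    film = FiLMEnsembleLayer(n_channels, n_members)
    head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(n_channels, 10))

    x = torch.randn(8, 3, 32, 32)
    feats = backbone_block(x)
    logits = torch.stack([head(film(feats, m)) for m in range(n_members)])
    probs = logits.softmax(dim=-1).mean(dim=0)  # ensemble-averaged predictive distribution
```

Because only the small `gamma`/`beta` tensors are duplicated per member while all other weights are shared, the memory overhead of the ensemble stays small; the disagreement between the member predictions can then be used as an estimate of epistemic uncertainty.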
