论文标题
关于可扩展信息瓶颈的相关性复杂区域
On the Relevance-Complexity Region of Scalable Information Bottleneck
论文作者
论文摘要
信息瓶颈方法是一种学习技术,通过在压缩复杂性之间进行适当的权衡,通过最小描述长度来衡量,并在对数损失度量下评估的失真之间寻求适当的平衡和概括能力之间的平衡。在本文中,我们研究了该问题的变化,称为可扩展信息瓶颈,其中编码器以越来越丰富的特征对观察结果进行多个描述。手头的问题是由于某些应用程序场景所激发的,这些方案需要根据允许的概括水平而需要变化的准确性。首先,我们为无内存的高斯源和无内存的二进制源建立了相关复杂区域的显式(分析)特征。然后,我们得出了一种Blahut-Arimoto型算法,该算法使我们能够计算(近似)的一般离散源区域。最后,提供了模式分类问题中的应用程序示例以及数值结果。
The Information Bottleneck method is a learning technique that seeks a right balance between accuracy and generalization capability through a suitable tradeoff between compression complexity, measured by minimum description length, and distortion evaluated under logarithmic loss measure. In this paper, we study a variation of the problem, called scalable information bottleneck, where the encoder outputs multiple descriptions of the observation with increasingly richer features. The problem at hand is motivated by some application scenarios that require varying levels of accuracy depending on the allowed level of generalization. First, we establish explicit (analytic) characterizations of the relevance-complexity region for memoryless Gaussian sources and memoryless binary sources. Then, we derive a Blahut-Arimoto type algorithm that allows us to compute (an approximation of) the region for general discrete sources. Finally, an application example in the pattern classification problem is provided along with numerical results.