Paper Title

Investigation of Alternative Measures for Mutual Information

Paper Authors

Bulut Kuskonmaz, Jaron Skovsted Gundersen, Rafal Wisniewski

Paper Abstract

Mutual information $I(X;Y)$ is a useful quantity in information theory for estimating how much information the random variable $Y$ holds about the random variable $X$. One way to define the mutual information is to compare the joint distribution of $X$ and $Y$ with the product of the marginals through the KL-divergence. If the two distributions are close to each other, there is almost no leakage of $X$ from $Y$, since the two variables are close to being independent. In the discrete setting, the mutual information has the nice interpretation of how many bits $Y$ reveals about $X$, and if $I(X;Y)=H(X)$ (the Shannon entropy of $X$) then $X$ is completely revealed. However, in the continuous case we do not have the same reasoning; for instance, the mutual information can be infinite. This fact motivates trying different metrics or divergences to define the mutual information. In this paper, we evaluate different metrics and divergences, such as the Kullback-Leibler (KL) divergence, the Wasserstein distance, the Jensen-Shannon divergence, and the total variation distance, to form alternatives to the mutual information in the continuous case. We deploy different methods to estimate or bound these metrics and divergences and evaluate their performance.
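
As a minimal numerical illustration of the divergences named in the abstract (not the authors' actual estimators), the sketch below compares the joint distribution of a correlated bivariate Gaussian $(X,Y)$ with the product of its marginals on a discretized grid. The correlation `rho`, the grid resolution, and the integration range are illustrative assumptions; for Gaussians, the KL-divergence of the joint from the product equals the closed-form mutual information $-\tfrac{1}{2}\log(1-\rho^2)$, which serves as a sanity check.

```python
# Illustrative sketch only: numerically compare divergences between the joint
# distribution of a correlated bivariate Gaussian and the product of its
# marginals. rho, the grid range, and the resolution are arbitrary choices.
import numpy as np
from scipy.stats import multivariate_normal, norm

rho = 0.8  # correlation between X and Y (illustrative)

# Discretize a 2D grid covering most of the probability mass.
xs = np.linspace(-5, 5, 400)
dx = xs[1] - xs[0]
X, Y = np.meshgrid(xs, xs, indexing="ij")
grid = np.dstack([X, Y])
cell = dx * dx  # area of one grid cell

# Joint density p(x, y) and product of marginals q(x, y) = p(x) p(y).
joint = multivariate_normal(mean=[0, 0], cov=[[1, rho], [rho, 1]]).pdf(grid)
product = norm.pdf(X) * norm.pdf(Y)

def kl(p, q):
    """Discretized KL-divergence D(p || q) in nats."""
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask])) * cell

# KL between joint and product: this is exactly the mutual information I(X;Y).
kl_joint_product = kl(joint, product)

# Jensen-Shannon divergence: symmetrized KL against the mixture m = (p + q)/2.
m = 0.5 * (joint + product)
js = 0.5 * kl(joint, m) + 0.5 * kl(product, m)

# Total variation distance: half the L1 distance between the densities.
tv = 0.5 * np.sum(np.abs(joint - product)) * cell

print(f"KL (numerical)      : {kl_joint_product:.4f}")
print(f"KL (closed form MI) : {-0.5 * np.log(1 - rho**2):.4f}")
print(f"Jensen-Shannon      : {js:.4f}")
print(f"Total variation     : {tv:.4f}")
```

With `rho = 0.8` the numerical KL should match the closed-form value of roughly 0.51 nats, while the Jensen-Shannon and total variation values remain bounded (by $\log 2$ and $1$, respectively) even as $\rho \to 1$ and the mutual information diverges, which reflects the abstract's motivation for considering bounded alternatives in the continuous case.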
