论文标题

非二元和时间随机块模型中的社区恢复

Community recovery in non-binary and temporal stochastic block models

论文作者

Avrachenkov, Konstantin, Dreveton, Maximilien, Leskelä, Lasse

论文摘要

本文研究了从$ n $节点网络中成对相互作用的潜在社区成员资格的估计,其中观察到的交互可能是任意类型的,包括二进制,分类和矢量价值,而不排除更一般的对象,例如时间序列或空间点模式。作为此类数据的生成模型,我们引入了一个随机块模型,其中具有一般可测量的交互空间$ \ MATHCAL S $,为此我们为最低可达到的错误率提供了信息理论界限。这些边界在数据稀疏性,内部和块间相互作用分布之间的统计相似性以及相互作用空间的形状和大小方面产生了尖锐的标准。一般的框架使得在$ n \ to \ infty $和$ t \ to \ infty $的设置中,使用$ \ Mathcal S = \ {0,1 \}^T $研究时间和多重网络成为可能,并且时间交互模式随着时间的流逝而相关。对于暂时的马尔可夫相互作用,我们得出了尖锐的一致性阈值。我们还提出了快速的在线估计算法,该算法充分利用了观察到的数据的非二元性质。关于合成和真实数据的数值实验表明,这些算法即使对于非常稀疏的数据阵列,这些算法也会迅速产生准确的估计值。

This article studies the estimation of latent community memberships from pairwise interactions in a network of $N$ nodes, where the observed interactions can be of arbitrary type, including binary, categorical, and vector-valued, and not excluding even more general objects such as time series or spatial point patterns. As a generative model for such data, we introduce a stochastic block model with a general measurable interaction space $\mathcal S$, for which we derive information-theoretic bounds for the minimum achievable error rate. These bounds yield sharp criteria for the existence of consistent and strongly consistent estimators in terms of data sparsity, statistical similarity between intra- and inter-block interaction distributions, and the shape and size of the interaction space. The general framework makes it possible to study temporal and multiplex networks with $\mathcal S = \{0,1\}^T$, in settings where both $N \to \infty$ and $T \to \infty$, and the temporal interaction patterns are correlated over time. For temporal Markov interactions, we derive sharp consistency thresholds. We also present fast online estimation algorithms which fully utilise the non-binary nature of the observed data. Numerical experiments on synthetic and real data show that these algorithms rapidly produce accurate estimates even for very sparse data arrays.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源