Kernel-based Approximate Bayesian Inference for Exponential Family Random Graph Models

Paper Title

Kernel-based Approximate Bayesian Inference for Exponential Family Random Graph Models

Authors

Fan Yin, Carter T. Butts

Abstract

Bayesian inference for exponential family random graph models (ERGMs) is a doubly-intractable problem because both the likelihood and the posterior normalizing factor are intractable. Auxiliary-variable-based Markov chain Monte Carlo (MCMC) methods for this problem are asymptotically exact but computationally demanding, and are difficult to extend to modified ERGM families. In this work, we propose a kernel-based approximate Bayesian computation algorithm for fitting ERGMs. By employing an adaptive importance sampling technique, we greatly improve the efficiency of the sampling step. Though approximate, our easily parallelizable approach yields accuracy comparable to state-of-the-art methods, with substantial improvements in compute time on multi-core hardware. Our approach also flexibly accommodates both algorithmic enhancements (including improved learning algorithms for estimating conditional expectations) and extensions to non-standard cases such as inference from non-sufficient statistics. We demonstrate the performance of this approach on two well-known network data sets, comparing its accuracy and efficiency with results obtained using the approximate exchange algorithm. Our tests show a wall-clock time advantage of up to 50% with five cores, and the ability to fit models in 1/5th the time at 30 cores; further speed enhancements are possible when more cores are available.
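
To make the general idea of kernel-weighted approximate Bayesian computation concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a toy edges-only ERGM (equivalent to a Bernoulli graph, so simulation needs no MCMC), a Gaussian prior, and a Gaussian kernel on the edge count, and it uses the prior as a fixed proposal rather than the adaptive importance sampling described in the abstract. All function names, the bandwidth, and the prior scale are illustrative assumptions.

```python
import numpy as np


def simulate_edge_counts(thetas, n_nodes, rng):
    """Simulate edge counts from an undirected edges-only ERGM.

    For this toy model each dyad is an independent Bernoulli draw with
    probability sigmoid(theta), so the edge count is binomial.
    """
    n_dyads = n_nodes * (n_nodes - 1) // 2
    p = 1.0 / (1.0 + np.exp(-np.asarray(thetas)))
    return rng.binomial(n_dyads, p)


def kernel_abc_posterior_mean(s_obs, n_nodes, n_draws=20000,
                              bandwidth=2.0, prior_sd=3.0, seed=0):
    """Kernel-weighted ABC estimate of the posterior mean of theta.

    Assumptions (hypothetical, for illustration only): N(0, prior_sd^2)
    prior, Gaussian kernel with the given bandwidth on the edge count,
    and the prior used as a non-adaptive proposal.
    """
    rng = np.random.default_rng(seed)
    thetas = rng.normal(0.0, prior_sd, size=n_draws)      # draw from the prior
    stats = simulate_edge_counts(thetas, n_nodes, rng)    # simulate statistics
    log_w = -0.5 * ((stats - s_obs) / bandwidth) ** 2     # Gaussian kernel weights
    w = np.exp(log_w - log_w.max())                       # stabilise before normalising
    w /= w.sum()
    return float(np.sum(w * thetas)), w


# Usage: 20 nodes (190 dyads) with 38 observed edges.
posterior_mean, weights = kernel_abc_posterior_mean(s_obs=38, n_nodes=20)
print("approximate posterior mean of theta:", posterior_mean)
```

Each simulated draw is independent of the others, which is why schemes of this kind parallelize easily across cores; a realistic ERGM would replace the binomial draw with an MCMC graph simulation.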
