Paper Title
Adaptive truncation of infinite sums: applications to Statistics
Paper Authors
Paper Abstract
It is often the case in Statistics that one needs to compute sums of infinite series, especially when marginalising over discrete latent variables. This has become more relevant with the popularization of gradient-based techniques (e.g. Hamiltonian Monte Carlo) in the Bayesian inference context, for which discrete latent variables are hard or impossible to handle. For many commonly used infinite series, custom algorithms have been developed which exploit specific features of each problem. General techniques, suitable for a large class of problems with limited input from the user, are less established. We employ basic results from the theory of infinite series to investigate general, problem-agnostic algorithms for truncating infinite sums within an arbitrary tolerance $\varepsilon > 0$ and provide robust computational implementations with provable guarantees. We compare three tentative solutions for estimating the infinite sum of interest: (i) a "naive" approach that sums terms until the terms fall below the threshold $\varepsilon$; (ii) a "bounding pair" strategy based on trapping the true value between two partial sums; and (iii) a "batch" strategy that computes partial sums at regular intervals and stops when their difference is less than $\varepsilon$. We show under which conditions each strategy guarantees that the truncated sum is within the required tolerance, and compare the error achieved by each approach as well as the number of function evaluations each one requires. A detailed discussion of numerical issues in practical implementations is also provided. The paper provides some theoretical discussion of a variety of statistical applications, including raw and factorial moments and count models with observation error. Finally, detailed illustrations in the form of noisy MCMC for Bayesian inference and maximum marginal likelihood estimation are presented.
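
To make the three stopping rules concrete, the short Python sketch below implements one plausible version of each, assuming the terms a(n) are non-negative and eventually decreasing. The function names, the geometric ratio bound L used by the bounding-pair rule, and the default batch size are illustrative assumptions made for this sketch, not the authors' implementation.

import math

def sum_naive(a, eps=1e-12, n_max=10**7):
    """(i) "Naive": add terms one by one; stop once a term drops below eps."""
    total = 0.0
    for n in range(n_max):
        term = a(n)
        total += term
        if term < eps:
            return total
    raise RuntimeError("n_max terms reached before the stopping rule triggered")

def sum_bounding_pair(a, L, eps=1e-12, n_max=10**7):
    """(ii) Bounding pair: trap the true sum between two partial-sum bounds.

    Assumes the term ratio a(n+1)/a(n) is bounded above by L < 1 beyond the
    stopping point, so the neglected tail after index n satisfies
    sum_{k > n} a(k) <= a(n+1) / (1 - L).
    """
    total = 0.0
    for n in range(n_max):
        total += a(n)
        tail_bound = a(n + 1) / (1.0 - L)  # upper bound on the neglected tail
        if tail_bound < eps:
            # the true sum lies in [total, total + tail_bound]; return the midpoint
            return total + 0.5 * tail_bound
    raise RuntimeError("n_max terms reached before the stopping rule triggered")

def sum_batch(a, batch_size=20, eps=1e-12, n_max=10**7):
    """(iii) Batch: checkpoint the partial sum every batch_size terms; stop when
    two consecutive checkpoints differ by less than eps."""
    total, previous, n = 0.0, None, 0
    while n < n_max:
        for _ in range(batch_size):
            total += a(n)
            n += 1
        if previous is not None and total - previous < eps:
            return total
        previous = total
    raise RuntimeError("n_max terms reached before the stopping rule triggered")

if __name__ == "__main__":
    lam = 3.5
    a = lambda n: lam ** n / math.factorial(n)  # this series sums to exp(lam)
    print(sum_naive(a), sum_bounding_pair(a, L=0.5), sum_batch(a), math.exp(lam))

The test series lam^n / n! sums to exp(lam), so all three estimates can be checked against a known value; the ratio lam/(n+1) is below L = 0.5 well before the stopping index, which is what the bounding-pair tail bound requires in this toy case.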