Paper Title
Linear-time inference for Gaussian Processes on one dimension
Paper Authors
Paper Abstract
Gaussian Processes (GPs) provide powerful probabilistic frameworks for interpolation, forecasting, and smoothing, but have been hampered by computational scaling issues. Here we investigate data sampled on one dimension (e.g., a scalar or vector time series sampled at arbitrarily-spaced intervals), for which state-space models are popular due to their linearly-scaling computational costs. It has long been conjectured that state-space models are general, able to approximate any one-dimensional GP. We provide the first general proof of this conjecture, showing that any stationary GP on one dimension with vector-valued observations governed by a Lebesgue-integrable continuous kernel can be approximated to any desired precision using a specifically-chosen state-space model: the Latent Exponentially Generated (LEG) family. This new family offers several advantages compared to the general state-space model: it is always stable (no unbounded growth), the covariance can be computed in closed form, and its parameter space is unconstrained (allowing straightforward estimation via gradient descent). The theorem's proof also draws connections to Spectral Mixture Kernels, providing insight about this popular family of kernels. We develop parallelized algorithms for performing inference and learning in the LEG model, test the algorithm on real and synthetic data, and demonstrate scaling to datasets with billions of samples.
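Since the abstract highlights three concrete properties of the LEG family (guaranteed stability, a closed-form covariance, and an unconstrained parameter space), a short sketch may help make them tangible. The construction below is a minimal illustration of how such a parameterization can work, not the authors' reference implementation: the parameter names N, R, B and the specific formulas G = N N^T + R - R^T and C(tau) = B exp(-G|tau|/2) B^T are our assumptions for illustration, not details stated in this abstract.

```python
# Minimal sketch (assumed parameterization, not the paper's reference code)
# of a LEG-style kernel: any real matrices N, R, B yield a stable process
# whose cross-covariance is available in closed form via a matrix exponential.
import numpy as np
from scipy.linalg import expm

def leg_kernel(tau, N, R, B):
    """Hypothetical closed-form LEG cross-covariance at lag |tau|."""
    # The symmetric part of G is N @ N.T, which is positive semidefinite,
    # so G has no eigenvalues with negative real part for *any* real N, R.
    # This is what makes the parameter space unconstrained yet stable.
    G = N @ N.T + R - R.T
    return B @ expm(-G * abs(tau) / 2.0) @ B.T

rng = np.random.default_rng(0)
ell, n = 3, 2                          # latent and observation dimensions
N = rng.standard_normal((ell, ell))    # unconstrained
R = rng.standard_normal((ell, ell))    # unconstrained
B = rng.standard_normal((n, ell))      # observation matrix

# For generic N the kernel decays with lag rather than growing without
# bound -- the "always stable" property in action.
for tau in [0.0, 1.0, 5.0, 50.0]:
    print(tau, np.linalg.norm(leg_kernel(tau, N, R, B)))
```

Because every real-valued setting of N, R, and B is admissible under this kind of construction, the parameters can be optimized directly by gradient descent, consistent with the abstract's claim of an unconstrained parameter space.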