Title
Log-Linear-Time Gaussian Processes Using Binary Tree Kernels
Authors
Abstract
Gaussian processes (GPs) produce good probabilistic models of functions, but most GP kernels require $O((n+m)n^2)$ time, where $n$ is the number of data points and $m$ the number of predictive locations. We present a new kernel that allows for Gaussian process regression in $O((n+m)\log(n+m))$ time. Our "binary tree" kernel places all data points on the leaves of a binary tree, with the kernel depending only on the depth of the deepest common ancestor. We can store the resulting kernel matrix in $O(n)$ space in $O(n \log n)$ time, as a sum of sparse rank-one matrices, and approximately invert the kernel matrix in $O(n)$ time. Sparse GP methods also offer linear run time, but they predict less well than higher dimensional kernels. On a classic suite of regression tasks, we compare our kernel against Matérn, sparse, and sparse variational kernels. The binary tree GP assigns the highest likelihood to the test data on a plurality of datasets, usually achieves lower mean squared error than the sparse methods, and often ties or beats the Matérn GP. On large datasets, the binary tree GP is fastest, and much faster than a Matérn GP.
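To make the "deepest common ancestor" idea concrete, here is a minimal sketch of a binary tree kernel on scalars in $[0, 1)$. The leaf-placement scheme (bisecting the interval bit by bit) and the per-level weights are illustrative assumptions, not the paper's actual construction; what the sketch preserves is that the kernel value depends only on the depth of the two leaves' deepest common ancestor.

```python
import numpy as np

def to_leaf_path(x, depth):
    """Map a scalar in [0, 1) to a leaf of a depth-`depth` binary tree,
    returned as the bit path from the root (0 = left child, 1 = right).
    Illustrative encoding only; the paper's scheme for placing data
    points on leaves may differ."""
    bits = []
    for _ in range(depth):
        x *= 2.0
        bit = int(x)
        bits.append(bit)
        x -= bit
    return bits

def binary_tree_kernel(x1, x2, depth=8, level_weights=None):
    """k(x1, x2) depends only on the depth of the deepest common
    ancestor (DCA) of the two leaves, i.e. the length of their shared
    bit prefix. `level_weights` is a hypothetical per-level weight
    vector; summing the weights of all shared levels mirrors the
    abstract's description of the kernel matrix as a sum of sparse
    per-level matrices."""
    if level_weights is None:
        # Toy geometric weights: deeper shared levels exist only for
        # nearby points, so similarity grows with DCA depth.
        level_weights = 0.5 ** np.arange(depth, 0, -1)
    p1, p2 = to_leaf_path(x1, depth), to_leaf_path(x2, depth)
    dca = 0
    for b1, b2 in zip(p1, p2):
        if b1 != b2:
            break
        dca += 1
    return float(np.sum(level_weights[:dca]))
```

Under this toy encoding, points that split at the root (e.g. 0.1 and 0.9) share no ancestor below the root and get kernel value 0, while points mapping to the same leaf receive the full weight sum; intermediate DCA depths interpolate between the two.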