Paper title
Splitting Gaussian Process Regression for Streaming Data
Authors
Abstract
Gaussian processes offer a flexible kernel method for regression. While Gaussian processes have many useful theoretical properties and have proven practically useful, they suffer from poor scaling in the number of observations. In particular, the cubic time complexity of updating standard Gaussian process models makes them generally unsuitable for application to streaming data. We propose an algorithm for sequentially partitioning the input space and fitting a localized Gaussian process to each disjoint region. The algorithm is shown to have superior time and space complexity to existing methods, and its sequential nature permits application to streaming data. The algorithm constructs a model for which the time complexity of updating is tightly bounded above by a pre-specified parameter. To the best of our knowledge, the model is the first local Gaussian process regression model to achieve linear memory complexity. Theoretical continuity properties of the model are proven. We demonstrate the efficacy of the resulting model on multi-dimensional regression tasks for streaming data.
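To make the idea of sequential partitioning concrete, the following is a minimal sketch and not the authors' algorithm. The abstract does not state the splitting rule, the kernel, or how continuity across regions is obtained, so everything here is an assumption chosen for illustration: the class name `SplittingGP`, the parameter `max_points`, a squared-exponential kernel with fixed length scale, and a median split along the dimension of largest variance. The sketch uses hard region boundaries, so it does not reproduce the continuity properties the abstract mentions. It only shows how capping the number of points per region bounds the cost of each update, and how storing each observation in exactly one leaf keeps memory linear in the number of observations.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / length_scale**2)


class SplittingGP:
    """Hypothetical tree of local zero-mean GPs (illustrative only).

    A leaf is split once it holds more than max_points observations, so
    every refit solves a linear system of size at most max_points
    (assuming inputs are distinct along the split dimension).
    """

    def __init__(self, max_points=40, noise=1e-2):
        self.max_points = max_points
        self.noise = noise
        self.X, self.y = [], []   # observations stored in this leaf
        self.alpha = None         # (K + noise * I)^{-1} y for this leaf
        self.split_dim = None
        self.split_value = None
        self.left = self.right = None

    def _child_for(self, x):
        return self.left if x[self.split_dim] <= self.split_value else self.right

    def update(self, x, y):
        """Route one streamed observation to its leaf, then refit or split."""
        x = np.asarray(x, dtype=float)
        if self.left is not None:
            self._child_for(x).update(x, y)
            return
        self.X.append(x)
        self.y.append(float(y))
        if len(self.X) > self.max_points:
            self._split()
        else:
            self._refit()

    def _refit(self):
        if not self.X:
            self.alpha = None
            return
        X = np.array(self.X)
        K = rbf_kernel(X, X) + self.noise * np.eye(len(X))
        # Cubic in the leaf size, which is capped by max_points.
        self.alpha = np.linalg.solve(K, np.array(self.y))

    def _split(self):
        # Assumed rule: median split along the dimension of largest variance.
        X = np.array(self.X)
        self.split_dim = int(np.argmax(X.var(axis=0)))
        self.split_value = float(np.median(X[:, self.split_dim]))
        self.left = SplittingGP(self.max_points, self.noise)
        self.right = SplittingGP(self.max_points, self.noise)
        for xi, yi in zip(self.X, self.y):
            child = self._child_for(xi)
            child.X.append(xi)
            child.y.append(yi)
        self.X, self.y, self.alpha = [], [], None
        self.left._refit()
        self.right._refit()

    def predict(self, x):
        """Posterior mean from the local GP of the region containing x."""
        x = np.asarray(x, dtype=float)
        if self.left is not None:
            return self._child_for(x).predict(x)
        if not self.X:
            return 0.0  # prior mean for an empty region
        k = rbf_kernel(x[None, :], np.array(self.X))
        return float(k[0] @ self.alpha)

    def leaf_sizes(self):
        if self.left is not None:
            return self.left.leaf_sizes() + self.right.leaf_sizes()
        return [len(self.X)]


# Stream synthetic two-dimensional data one observation at a time.
rng = np.random.default_rng(0)
X_stream = rng.uniform(-3.0, 3.0, size=(300, 2))
y_stream = np.sin(X_stream[:, 0]) + np.cos(X_stream[:, 1])

model = SplittingGP(max_points=40)
for x_i, y_i in zip(X_stream, y_stream):
    model.update(x_i, y_i)

print("leaf sizes:", model.leaf_sizes())
print("prediction at (0.5, -0.5):", model.predict([0.5, -0.5]))
```

In this sketch `max_points` plays the role of a pre-specified parameter that bounds the update cost: no matter how many observations have been streamed, a single update refits at most one leaf (or two, on a split), each with at most `max_points` points.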