Paper title
Locally induced Gaussian processes for large-scale simulation experiments
Authors
Abstract
Gaussian processes (GPs) serve as flexible surrogates for complex surfaces, but buckle under the cubic cost of matrix decompositions when training data sets are large. The geospatial and machine learning communities suggest pseudo-inputs, or inducing points, as one strategy for obtaining an approximation that eases this computational burden. However, we show how the placement of inducing points, and their multitude, can be thwarted by pathologies, especially in large-scale dynamic response surface modeling tasks. As a remedy, we suggest porting the inducing point idea, which is usually applied globally, over to a more local context where selection is both easier and faster. In this way, our proposed methodology hybridizes global inducing point and data subset-based local GP approximation. A cascade of strategies for planning the selection of local inducing points is provided, and comparisons are drawn to related methodology with emphasis on computer surrogate modeling applications. We show that local inducing points extend their global and data-subset component parts on the accuracy versus computational efficiency frontier. Illustrative examples are provided on benchmark data and a large-scale real-simulation satellite drag interpolation problem.
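To make the hybrid idea concrete, the following is a minimal sketch, not the authors' implementation, of a prediction that combines a nearest-neighbor data subset with a handful of inducing points placed inside that neighborhood. Everything specific here is an assumption for illustration: the squared-exponential kernel with a fixed lengthscale, Euclidean nearest-neighbor selection, greedy farthest-point placement of the inducing points, the standard subset-of-regressors (DTC) predictive mean, and the hypothetical function names `sq_exp_kernel`, `farthest_point_subset`, and `local_inducing_mean`.

```python
import numpy as np


def sq_exp_kernel(A, B, lengthscale):
    """Squared-exponential kernel matrix between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-0.5 * d2 / lengthscale**2)


def farthest_point_subset(P, m):
    """Greedily pick m well-spread rows of P, starting from row 0."""
    chosen = [0]
    d = np.linalg.norm(P - P[0], axis=1)
    for _ in range(m - 1):
        j = int(np.argmax(d))  # point farthest from everything chosen so far
        chosen.append(j)
        d = np.minimum(d, np.linalg.norm(P - P[j], axis=1))
    return P[chosen]


def local_inducing_mean(X, y, x_star, n_local=50, m=5,
                        lengthscale=0.05, noise=1e-6):
    """Approximate GP predictive mean at x_star from a local neighborhood.

    Assumes m <= n_local <= len(X); hyperparameters are fixed, not estimated.
    """
    x_star = np.atleast_2d(x_star)
    # 1. Data subset: the n_local training points nearest to x_star.
    order = np.argsort(np.linalg.norm(X - x_star, axis=1))[:n_local]
    Xn, yn = X[order], y[order]
    # 2. Local inducing points: m spread-out neighbors (illustrative choice).
    U = farthest_point_subset(Xn, m)
    # 3. Subset-of-regressors / DTC mean: only an m-by-m system is solved.
    Kuu = sq_exp_kernel(U, U, lengthscale)
    Kuf = sq_exp_kernel(U, Xn, lengthscale)
    Ksu = sq_exp_kernel(x_star, U, lengthscale)
    A = noise * Kuu + Kuf @ Kuf.T + 1e-10 * np.eye(m)  # small jitter
    return (Ksu @ np.linalg.solve(A, Kuf @ yn)).item()


# Toy usage on a smooth one-dimensional response surface.
X = np.linspace(0.0, 1.0, 500)[:, None]
y = np.sin(2.0 * np.pi * X[:, 0])
print(local_inducing_mean(X, y, np.array([0.503])))
```

Under these assumptions, each prediction costs on the order of `n_local * m**2` operations plus the neighbor search, rather than growing cubically with the full training set size. The placement rule for the inducing points is the part the abstract identifies as delicate, so the greedy rule above should be read as a placeholder only.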