Paper title
Kernel methods and their derivatives: Concept and perspectives for the Earth system sciences
Paper authors
Paper abstract
Kernel methods are powerful machine learning techniques that implement generic non-linear functions to solve complex tasks in a simple way. They have a solid mathematical background and exhibit excellent performance in practice. However, kernel machines are still considered black-box models because the feature mapping is not directly accessible and is difficult to interpret. The aim of this work is to show that it is indeed possible to interpret the functions learned by various kernel methods, despite their complexity. Specifically, we show that the derivatives of these functions have a simple mathematical formulation, are easy to compute, and can be applied to many different problems. We note that the derivatives of the model function in kernel machines are proportional to the derivatives of the kernel function. We provide the explicit analytic form of the first and second derivatives of the most common kernel functions with respect to the inputs, as well as generic formulas to compute higher-order derivatives. We use them to analyze the most widely used supervised and unsupervised kernel learning methods: Gaussian Processes for regression, Support Vector Machines for classification, Kernel Entropy Component Analysis for density estimation, and the Hilbert-Schmidt Independence Criterion for estimating the dependency between random variables. In all cases, we express the derivative of the learned function as a linear combination of kernel function derivatives. Moreover, we provide intuitive explanations through illustrative toy examples and show how to improve the interpretation of real applications in the context of spatiotemporal Earth system data cubes. This work reflects on the observation that function derivatives may play a crucial role in the analysis and understanding of kernel methods.
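As a rough illustration of the statement that the derivative of a learned function is a linear combination of kernel function derivatives, the following minimal sketch differentiates a kernel expansion f(x) = Σ_i α_i k(x, x_i) with an RBF kernel. It is not taken from the paper: the function names (`rbf_kernel`, `predict`, `predict_gradient`), the length scale `sigma`, and the random weights `alpha` are all assumptions made for illustration. In a fitted model the weights would come from training, for example from Gaussian Process regression.

```python
import numpy as np

def rbf_kernel(x, X, sigma=1.0):
    """RBF kernel values k(x, x_i) between one point x (d,) and the rows of X (n, d)."""
    sq_dist = np.sum((X - x) ** 2, axis=1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def predict(x, X, alpha, sigma=1.0):
    """Kernel expansion f(x) = sum_i alpha_i * k(x, x_i)."""
    return rbf_kernel(x, X, sigma) @ alpha

def predict_gradient(x, X, alpha, sigma=1.0):
    """Gradient of f at x, written as sum_i alpha_i * dk(x, x_i)/dx.

    For the RBF kernel, dk(x, x_i)/dx = -(x - x_i) / sigma**2 * k(x, x_i).
    """
    k = rbf_kernel(x, X, sigma)                # shape (n,)
    dk = -(x - X) / sigma ** 2 * k[:, None]    # shape (n, d), one kernel gradient per row
    return alpha @ dk                          # shape (d,)

# Hypothetical data and weights, used only to exercise the functions.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))     # 20 training points in 3 dimensions
alpha = rng.normal(size=20)      # stand-in for learned dual weights
x0 = rng.normal(size=3)          # point at which to evaluate the gradient

grad = predict_gradient(x0, X, alpha)
print(grad)
```

Each component of `grad` indicates how strongly the model output responds to the corresponding input dimension at `x0`, which is the kind of sensitivity information the abstract refers to. The weights enter the gradient exactly as they enter the prediction; only the kernel is replaced by its derivative.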